
Data Analysis

Data analysis is an exciting discipline that makes it possible to understand phenomena and to gain insight and knowledge from raw data. More precisely, the term refers to processing data with mathematical and statistical techniques in order to draw valuable conclusions, make important decisions, and develop useful products. It corresponds to the English term data science, which is often treated as a synonym for business analytics, operations research, business intelligence, competitive intelligence, data analysis and modeling, and knowledge discovery. With technologies such as the Python and R languages and the Hadoop and Spark platforms, you have the chance to draw the most from your data, spot growth opportunities for your organization, or anticipate and prevent threats.


649

EBOOK

RAG from First Principles. Engineering retrieval-augmented generation systems with Python, LangChain, and LlamaIndex

Jia Huang

Most developers can spin up a RAG pipeline in an afternoon using LangChain or LlamaIndex. Far fewer understand why retrieval fails or how to fix it. This book is for those who want to go deeper. 'RAG From First Principles' dismantles the retrieval-augmented generation stack layer by layer: how documents are ingested and parsed, why chunking strategy directly impacts answer quality, how embedding models encode meaning, what happens inside a vector database, and how sparse and dense retrieval interact in a hybrid system. Written by Jia Huang, a research engineer and bestselling AI author, it brings research depth and production experience to one of AI's most critical engineering disciplines. Structured as a progressive dialogue between a seasoned engineer and two students, the book surfaces the questions practitioners actually ask. Each chapter builds on the last, from data import and chunking through embedding selection, index design, hybrid search, and post-retrieval processing, into response generation, evaluation, and advanced paradigms including GraphRAG, Agentic RAG, and Modular RAG. By the end, you'll have the architectural understanding to optimize, debug, and extend your RAG systems with confidence.

650

EBOOK

Rapid - Apache Mahout Clustering designs. Explore clustering algorithms used with Apache Mahout

Ashish Gupta

As more organizations discover the value of big data analytics, interest in platforms that provide storage, computation, and analytic capabilities has grown. Apache Mahout caters to this need and paves the way for implementing complex machine learning algorithms to better analyze your data and extract useful insights. Starting with an introduction to clustering algorithms, this book provides insight into Apache Mahout and the algorithms it uses for clustering data. It gives a general introduction to algorithms such as K-Means, Fuzzy K-Means, and StreamingKMeans, and shows how to use Mahout to cluster your data with a particular algorithm. You will study the different types of clustering and learn how to use Apache Mahout with real-world data sets to implement and evaluate your clusters. The book also discusses cluster improvement and visualization using Mahout APIs and explores model-based clustering and topic modeling using the Dirichlet process. Finally, you will learn how to build and deploy a model for production use.

651

EBOOK

Raportowanie w System Center Configuration Manager Bez tajemnic

Garth Jones, Dan Toll, Kerrie Meyler

The SQL Server database behind Microsoft System Center Configuration Manager (ConfigMgr) holds a wealth of valuable information about your users, computers, hardware, operating systems, applications, and compliance state. To help you extract this data efficiently, Microsoft provides several excellent tools, including SQL Server Reporting Services (SSRS) and SQL Server Data Tools Business Intelligence (SSDT-BI). Raportowanie w System Center Configuration Manager Bez tajemnic shows you how to get the most out of these tools. World-renowned reporting guru Garth Jones and his expert co-authors guide you through every aspect of custom reporting in System Center. Starting with installing and configuring SSRS, you will learn step by step how to use SQL views to find the data you need, build SQL queries, create simple and advanced reports, and use role-based administration to deliver those reports securely to the right people. In this book, Jones has gathered current, reliable, and comprehensive System Center reporting techniques that you will not find in other books or on websites. With this guide, you will be able to consistently obtain the right information to solve pressing problems and respond quickly to management's concerns.

Garth Jones, chief architect at Enhansoft and a Microsoft MVP, specializes in extending the value and reach of System Center Configuration Manager. He has worked with the System Center product family since 1996, when it was still known as SMS. Dan Toll is a Configuration Manager administrator who has worked with the product since SMS 2003. He specializes in operating system deployment for workstations and servers using the Microsoft Deployment Toolkit (MDT) and in ConfigMgr reporting. Kerrie Meyler, a Microsoft MVP, is the lead author of many books in the System Center Unleashed series. She currently works as an independent consultant. During a career spanning more than 17 years, she evangelized SMS as a senior technology specialist at Microsoft and presented System Center technologies at the TechEd and MMS conferences.

Detailed information on: installing and configuring SSRS for optimal System Center reporting and easier troubleshooting; the data stored in the ConfigMgr site database; efficiently retrieving ConfigMgr data by writing SQL queries in SQL Server Management Studio; best practices for creating and designing System Center reports; creating report templates, customizing content with report parameters, and embedding charts; customizing logos, color palettes, and other report elements for a specific organization; building advanced drill-through methods to deliver additional information; strengthening report security by integrating ConfigMgr role-based administration into SQL queries; using reporting to measure key performance indicators and deepen your knowledge of your own environment; tailoring reports to the needs of end users or management.

ON THE WEB: All examples and scripts presented in this book are available for download at informit.com/title/9780672337789

652

EBOOK

Reactive Programming in Kotlin. Design and build non-blocking, asynchronous Kotlin applications with RXKotlin, Reactor-Kotlin, Android, and Spring

Rivu Chakraborty

In today's app-driven era, when programs are asynchronous and responsiveness is vital, reactive programming can help you write code that is more reliable, easier to scale, and better performing. Reactive programming is revolutionary. With this practical book, Kotlin developers will first learn how to view problems in the reactive way, and then build programs that leverage the best features of this exciting new programming paradigm. You will begin with the general concepts of reactive programming and then gradually move on to working with asynchronous data streams. You will dive into advanced techniques such as manipulating time in data flow, customizing operators and provider, and using the concurrency model to control the asynchronicity of code and process event handlers effectively. You will then be introduced to functional reactive programming (FRP) and learn to apply it to practical use cases in Kotlin. The book also takes you a step further by introducing Spring 5 and Spring Boot 2 with Kotlin. By the end of the book, you will be able to build real-world applications with reactive user interfaces and implement reactive programming paradigms in Android.

653

EBOOK

Real-Time Big Data Analytics. Design, process, and analyze large sets of complex data in real time

Shilpi Saxena

Enterprises have been striving to deal with the challenges of data arriving in real time or near real time. Although technologies such as Storm and Spark (and many more) address the challenges of real-time data, choosing the appropriate technology or framework for the right business use case is the key to success. This book provides you with the skills required to quickly design, implement, and deploy your real-time analytics using real-world examples of big data use cases. The book begins with the basics of various real-time data processing frameworks and technologies. It explains the differences between batch and real-time processing in detail and explores the techniques and programming concepts of Apache Storm. Moving on, it familiarizes you with Amazon Kinesis for real-time data processing in the cloud. It then develops your understanding of real-time analytics through a comprehensive review of Apache Spark, along with the high-level architecture and the building blocks of a Spark program. You will learn how to transform your data, get output from transformations, and persist your results using Spark RDDs, and how to work with Spark through the Spark SQL interface. The book closes by introducing Spark Streaming, the streaming library of Spark, and walking you through the emerging Lambda Architecture (LA), which provides a hybrid platform for big data processing by combining real-time and precomputed batch data to give a near real-time view of incoming data.

654

EBOOK

Redash v5 Quick Start Guide. Create and share interactive dashboards using Redash

Alexander Leibzon, Yael Leibzon

Data exploration and visualization are vital to Business Intelligence, the backbone of almost every enterprise or organization. Redash is a querying and visualization tool developed to simplify how marketing and business development departments are exposed to data. If you want to learn to create interactive dashboards with Redash, explore different visualizations, and share the insights with your peers, then this is the ideal book for you. The book starts with essential Business Intelligence concepts that are at the heart of data visualizations. You will learn how to find your way around Redash and its rich array of data visualization options for building interactive dashboards. You will learn how to create data stories and share them with peers. You will see how to connect to different data sources to process complex data, and then visualize this data to reveal valuable insights. By the end of this book, you will be confident using the Redash dashboarding tool to provide insight and communicate data stories.

655

EBOOK

Regression Analysis with Python. Discover everything you need to know about the art of regression analysis with Python, and change how you view data

Luca Massaron, Alberto Boschetti

Regression is the process of learning relationships between inputs and continuous outputs from example data, which enables predictions for novel inputs. There are many kinds of regression algorithms, and the aim of this book is to explain which is the right one to use for each set of problems and how to prepare real-world data for it. With this book you will learn to define a simple regression problem and evaluate its performance. The book will help you understand how to properly parse a dataset, clean it, and create an output matrix optimally built for regression. You will begin with a simple regression algorithm to solve some data science problems and then progress to more complex algorithms. The book will enable you to use regression models to predict outcomes and make critical business decisions. Through the book, you will learn to use Python to build fast, better linear models and to apply the results in Python or in any computer language you prefer.

656

EBOOK

Regression Analysis with R. Design and develop statistical nodes to identify unique relationships within data at scale

Giuseppe Ciaburro, Pierre Paquay, Manoj Kumar, Shaikh...

Regression analysis is a statistical process that enables prediction of relationships between variables. The predictions are based on the causal effect of one variable upon another. Regression techniques for modeling and analysis are applied to large sets of data in order to reveal hidden relationships among the variables. This book gives you a rundown of what regression analysis is and explains the process from scratch. The first few chapters explain the different types of learning, supervised and unsupervised, and how they differ from each other. The book then covers supervised learning in detail, including the various aspects of regression analysis. The chapters are arranged to give a feel for all the steps in a data science process: loading the training dataset, handling missing values, EDA on the dataset, transformations and feature engineering, model building, assessing model fit and performance, and finally making predictions on unseen datasets. Each chapter starts by explaining the theoretical concepts, and once the reader is comfortable with the theory, moves to practical examples that support the understanding. The practical examples are illustrated using R code, including different R packages such as R Stats, Caret, and so on. By the end of this book you will know all the concepts and pain points related to regression analysis, and you will be able to apply what you have learned in your projects.