Data analysis
Rapid - Apache Mahout Clustering designs. Explore clustering algorithms used with Apache Mahout
Ashish Gupta
As more organizations discover the value of big data analytics, interest in platforms that provide storage, computation, and analytic capabilities has increased. Apache Mahout caters to this need and paves the way for implementing complex machine learning algorithms to better analyze your data and gain useful insights from it. Starting with an introduction to clustering algorithms, this book provides an insight into Apache Mahout and the different algorithms it uses for clustering data. It gives a general introduction to algorithms such as K-Means, Fuzzy K-Means, and StreamingKMeans, and shows how to use Mahout to cluster your data with a particular algorithm. You will study the different types of clustering and learn how to use Apache Mahout with real-world data sets to implement and evaluate your clusters. This book also discusses cluster improvement and visualization using Mahout APIs and explores model-based clustering and topic modelling using the Dirichlet process. Finally, you will learn how to build and deploy a model for production use.
System Center Configuration Manager Reporting Unleashed
Garth Jones, Dan Toll, Kerrie Meyler
The SQL Server database of Microsoft System Center Configuration Manager (ConfigMgr) holds a wealth of valuable information about your users, computers, hardware, operating systems, applications, and compliance status. To help you extract that data efficiently, Microsoft provides several excellent tools, including SQL Server Reporting Services (SSRS) and SQL Server Data Tools Business Intelligence (SSDT-BI). This book shows you how to get the most out of these tools. World-renowned reporting guru Garth Jones and his expert coauthors guide you through every aspect of custom reporting in System Center. Starting with the installation and configuration of SSRS, you will learn step by step how to use SQL views to find the data you need, build SQL queries, create simple and advanced reports, and use role-based administration to deliver those reports securely to the right people. In this book, Jones has gathered current, reliable, and comprehensive System Center reporting techniques that you will not find in other books or on websites. With this guide, you will be able to consistently obtain the right information to solve pressing problems and respond quickly to management concerns. Garth Jones, chief architect at Enhansoft and a Microsoft MVP, specializes in extending the value and reach of System Center Configuration Manager. He has worked with the System Center product family since 1996, when it was still known as SMS. Dan Toll is a Configuration Manager administrator who has worked with the product since SMS 2003. He specializes in operating system deployment for workstations and servers using the Microsoft Deployment Toolkit (MDT) and in ConfigMgr reporting. Kerrie Meyler, a Microsoft MVP, is the lead author of numerous books in the System Center Unleashed series. She currently works as an independent consultant. Over a career spanning more than 17 years, she evangelized SMS as a senior technology specialist at Microsoft and presented System Center technologies at the TechEd and MMS conferences. Detailed information on: installing and configuring SSRS for optimal System Center reporting and easier troubleshooting; the data stored in the ConfigMgr site database; efficiently retrieving ConfigMgr data by writing SQL queries in SQL Server Management Studio; best practices for creating and designing System Center reports; creating report templates, customizing content with report parameters, and embedding charts; customizing logos, color palettes, and other report elements for a specific organization; building advanced drill-through methods to deliver additional detail; strengthening report security by integrating ConfigMgr role-based administration into SQL queries; using reporting to measure key performance indicators and deepen your knowledge of your own environment; tailoring reports to the needs of end users or management. ON THE WEB: All examples and scripts presented in this book are available for download at informit.com/title/9780672337789
Rivu Chakraborty
In today's app-driven era, when programs are asynchronous and responsiveness is vital, reactive programming can help you write code that is more reliable, easier to scale, and better performing. With this practical book, Kotlin developers will first learn how to view problems in the reactive way, and then build programs that leverage the best features of this programming paradigm. You will begin with the general concepts of reactive programming and then gradually move on to working with asynchronous data streams. You will dive into advanced techniques such as manipulating time in data flows, customizing operators and providers, and using the concurrency model to control the asynchronicity of code and process event handlers effectively. You will then be introduced to functional reactive programming (FRP) and learn to apply FRP in practical use cases in Kotlin. This book also takes you one step further by introducing you to Spring 5 and Spring Boot 2 using Kotlin. By the end of the book, you will be able to build real-world applications with reactive user interfaces and implement reactive programming paradigms in Android.
Real-Time Big Data Analytics. Design, process, and analyze large sets of complex data in real time
Shilpi Saxena
Enterprises have been striving to deal with the challenges of data arriving in real time or near real time. Although technologies such as Storm and Spark (and many more) address the challenges of real-time data, using the appropriate technology or framework for the right business use case is the key to success. This book provides you with the skills required to quickly design, implement, and deploy your real-time analytics using real-world examples of big data use cases. The book begins with the basics of various real-time data processing frameworks and technologies. We will explain the differences between batch and real-time processing in detail, and explore the techniques and programming concepts used with Apache Storm. Moving on, we’ll familiarize you with Amazon Kinesis for real-time data processing in the cloud. We will further develop your understanding of real-time analytics through a comprehensive review of Apache Spark, along with the high-level architecture and the building blocks of a Spark program. You will learn how to transform your data, get output from transformations, and persist your results using Spark RDDs, and how to work with Spark through an interface called Spark SQL. At the end of this book, we will introduce Spark Streaming, the streaming library of Spark, and walk you through the emerging Lambda Architecture (LA), which provides a hybrid platform for big data processing by combining real-time and precomputed batch data to give a near real-time view of incoming data.
Redash v5 Quick Start Guide. Create and share interactive dashboards using Redash
Alexander Leibzon, Yael Leibzon
Data exploration and visualization are vital to Business Intelligence, the backbone of almost every enterprise or organization. Redash is a querying and visualization tool developed to simplify how marketing and business development departments are exposed to data. If you want to learn to create interactive dashboards with Redash, explore different visualizations, and share the insights with your peers, then this is the ideal book for you. The book starts with essential Business Intelligence concepts that are at the heart of data visualization. You will learn how to find your way around Redash and its rich array of data visualization options for building interactive dashboards. You will learn how to create data stories and share them with peers. You will see how to connect to different data sources to process complex data, and then visualize this data to reveal valuable insights. By the end of this book, you will be confident using the Redash dashboarding tool to provide insight and tell stories with data.
Luca Massaron, Alberto Boschetti
Regression is the process of learning relationships between inputs and continuous outputs from example data, which enables predictions for novel inputs. There are many kinds of regression algorithms, and the aim of this book is to explain which is the right one to use for each set of problems and how to prepare real-world data for it. With this book you will learn to define a simple regression problem and evaluate its performance. The book will help you understand how to properly parse a dataset, clean it, and create an output matrix optimally built for regression. You will begin with a simple regression algorithm to solve some data science problems and then progress to more complex algorithms. The book will enable you to use regression models to predict outcomes and make critical business decisions. Through the book, you will learn to use Python to build fast, better linear models and to apply the results in Python or in any programming language you prefer.
Giuseppe Ciaburro, Pierre Paquay, Manoj Kumar, Shaikh...
Regression analysis is a statistical process that enables prediction of relationships between variables. The predictions are based on the causal effect of one variable upon another. Regression techniques for modeling and analysis are applied to large sets of data in order to reveal hidden relationships among the variables. This book gives you a rundown of what regression analysis is and explains the process from scratch. The first few chapters describe the different types of learning, supervised and unsupervised, and how they differ from each other. We then cover supervised learning in detail, including the various aspects of regression analysis. The chapters are arranged to give a feel for all the steps in a data science process: loading the training dataset, handling missing values, EDA on the dataset, transformations and feature engineering, model building, assessing model fit and performance, and finally making predictions on unseen datasets. Each chapter starts by explaining the theoretical concepts and, once the reader is comfortable with the theory, moves on to practical examples that support the understanding. The practical examples are illustrated using R code, including different R packages such as R Stats, Caret, and so on. By the end of this book you will know the concepts and pain points related to regression analysis, and you will be able to apply what you have learned in your projects.
Svetlana Karslioglu
Pachyderm is an open source project that enables data scientists to run reproducible data pipelines and scale them to an enterprise level. This book will teach you how to implement Pachyderm to create collaborative data science workflows and reproduce your ML experiments at scale. You’ll begin your journey by exploring the importance of data reproducibility and comparing different data science platforms. Next, you’ll explore how Pachyderm fits into the picture and its significance, followed by learning how to install Pachyderm locally on your computer or on a cloud platform of your choice. You’ll then discover the architectural components and Pachyderm's main pipeline principles and concepts. The book demonstrates how to use Pachyderm components to create your first data pipeline and advances to cover common operations involving data, such as moving data to and from Pachyderm, to create more complex pipelines. Based on what you've learned, you'll develop an end-to-end ML workflow, before trying out the hyperparameter tuning technique and the different supported Pachyderm language clients. Finally, you’ll learn how to use a SaaS version of Pachyderm with Pachyderm Notebooks. By the end of this book, you will have learned all aspects of running your data pipelines in Pachyderm and be able to manage them on a day-to-day basis.