Big data

1
E-book

100 sposobów na Excel 2007 PL. Tworzenie funkcjonalnych arkuszy

Raina Hawley, David Hawley

Use the full power of Excel to create functional, attractive worksheets. How do you analyze and manage data? How do you get the most out of PivotTables? How do you create customized charts? Most Excel users know only a small fraction of what the program can do. Yet there are many ways to significantly extend its capabilities and build attractive spreadsheets without time-consuming study. The new version of Excel makes it easier to use, among other things, PivotTables, conditional formatting and range names, Live Preview, galleries of predefined styles, and SmartArt graphics. As a result, anyone can now use more sophisticated visual and graphical elements in their worksheets. Today, when everything moves ever faster, time is becoming one of the most important and sought-after assets. The book "100 sposobów na Excel 2007 PL. Tworzenie funkcjonalnych arkuszy" helps you save exactly that: it offers more than one hundred ready-made methods for building functional, attractive worksheets, along with fast and reliable solutions to complicated problems. It is also worth using these methods to study and apply some of the capabilities of Visual Basic for Applications (VBA), so that you can adapt all the suggestions in the book to your own needs. Topics covered: workbooks and worksheets; built-in mechanisms for analyzing and managing data; methods for creating names and cell ranges; PivotTables; formulas and functions; charts and macros; Excel's cooperation with other Office applications. Everything you want to know about Excel in order to put it to use right away. Discover more than a hundred ways to work effectively with Excel!

2
E-book

Access. Analiza danych. Receptury

Use the Access database like a professional! How do you apply statistical indicators to the analysis of business data? How do you extend the functionality of SQL queries with VBA scripts? How do you process data and move it between Access databases? Access is a well-established tool for versatile data processing and analysis. It has many hidden mechanisms that let you efficiently carry out tasks that may at first seem complicated. The book presents sample queries, methods for moving data between Access databases, the calculation of many financial and business indicators, and a good deal more, all from the standpoint of data analysis and processing. Each recipe comes with a complete description of the solution, a detailed discussion of the procedure, and an analysis of the code. Access. Analiza danych. Receptury is a universal handbook intended for both beginning and experienced Access users. Thanks to its clear language and the breadth of topics covered, anyone, regardless of skill level, can expand their knowledge. It contains plenty of interesting tips and techniques that make everyday work with databases easier, which makes it attractive even to people who already use Access fluently. It is also a compendium of the knowledge needed by anyone who wants to extract truly valuable information from data sets. Topics covered: creating queries of various types; inserting, updating, and deleting data; processing text and numbers stored as character strings; using tables, modifying the contents of the Windows system, and encrypting data; using the FileSystemObject object, processing XML and XSLT data, and communicating with SQL databases; solving business problems; calculating statistical indicators. A database is the foundation of a business - see how to manage it effectively!

3
E-book

Advanced Deep Learning with Keras. Apply deep learning techniques, autoencoders, GANs, variational autoencoders, deep reinforcement learning, policy gradients, and more

Rowel Atienza

Recent developments in deep learning, including Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Deep Reinforcement Learning (DRL) are creating impressive AI results in our news headlines - such as AlphaGo Zero beating world chess champions, and generative AI that can create art paintings that sell for over $400k because they are so human-like. Advanced Deep Learning with Keras is a comprehensive guide to the advanced deep learning techniques available today, so you can create your own cutting-edge AI. Using Keras as an open-source deep learning library, you'll find hands-on projects throughout that show you how to create more effective AI with the latest techniques. The journey begins with an overview of MLPs, CNNs, and RNNs, which are the building blocks for the more advanced techniques in the book. You’ll learn how to implement deep learning models with Keras and TensorFlow 1.x, and move forwards to advanced techniques, as you explore deep neural network architectures, including ResNet and DenseNet, and how to create autoencoders. You then learn all about GANs, and how they can open new levels of AI performance. Next, you’ll get up to speed with how VAEs are implemented, and you’ll see how GANs and VAEs have the generative power to synthesize data that can be extremely convincing to humans - a major stride forward for modern AI. To complete this set of advanced techniques, you'll learn how to implement DRL such as Deep Q-Learning and Policy Gradient Methods, which are critical to many modern results in AI.

4
E-book

Analiza i prezentacja danych w Microsoft Excel. Vademecum Walkenbacha

John Walkenbach, Michael Alexander

Harness the power of Excel in management! What are management dashboards? How do you present the most important information effectively? How do you automate report creation? Excel is an indispensable tool when you need to process hundreds, thousands, or even millions of data items. On the market for many years, the program has won popularity in practically every environment, from academia to managers and chief executives, thanks to its intuitive user interface, enormous capabilities, and reasonable price. This book is intended for the latter group. The constant inflow of new information in the business world makes it hard to take in. Management dashboards come to the rescue! With this book you will learn how to build them and how to pick out the most important information from a sea of data. Along the way you will learn to analyze the available data and present it in a useful form, use quick presentation methods, automate reporting processes, and create eye-catching presentations. You will also become fluent in using PivotTables and PivotCharts and in building advanced components for presenting trends or assessing performance against goals. This is the ideal book for every manager drowning in a thicket of data! Topics covered: the definition of management dashboards; determining user requirements; dashboard design principles; designing the data model; charts in Microsoft Excel; using PivotTables; creating PivotCharts; sparklines; other data visualization techniques; creating components for presenting trends and grouping data; presenting performance against goals; using macros in reports; adding interactive controls to a dashboard; importing data from Microsoft Access; methods for sharing data securely. Organize and effectively present the most important information!

5
E-book

Analiza i prezentacja danych w Microsoft Excel. Vademecum Walkenbacha. Wydanie II

John Walkenbach, Michael Alexander

Harness the power of Excel in management! If you are facing hundreds, or perhaps thousands or millions, of data items from which you want to draw accurate conclusions, you need a tool that helps you get a handle on them. That tool is, of course, Excel. No matter who you are - a student, an accountant, a manager, or a chief executive - you are sure to appreciate the potential it holds! With this book you will learn how to pick out the most important information from a sea of data. Along the way you will learn to prepare reports and presentations in no time. You will find that PivotTables need not be so intimidating, and you will see the best techniques for presenting trends or assessing performance against goals. This new edition has been updated, improved, and expanded with plenty of new, useful material. You will learn how to import data from a SQL Server database and how to use the Power View add-in. This book is ideal for anyone drowning in a thicket of data! With this book you will: get to know Excel's tools for data analysis and presentation; master the best techniques for designing tables; prepare clear reports; make full use of Excel's capabilities. Rescue yourself from the sea of data!

6
E-book

Analytics for the Internet of Things (IoT). Intelligent analytics for your intelligent devices

Andrew Minteer

We start with the perplexing task of extracting value from huge amounts of barely intelligible data. The data takes a convoluted route just to be on the servers for analysis, but insights can emerge through visualization and statistical modeling techniques. You will learn to extract value from IoT big data using multiple analytic techniques. Next we review how IoT devices generate data and how the information travels over networks. You’ll get to know strategies to collect and store the data to optimize the potential for analytics, and strategies to handle data quality concerns. Cloud resources are a great match for IoT analytics, so Amazon Web Services, Microsoft Azure, and PTC ThingWorx are reviewed in detail next. Geospatial analytics is then introduced as a way to leverage location information. Combining IoT data with environmental data is also discussed as a way to enhance predictive capability. We’ll also review the economics of IoT analytics and you’ll discover ways to optimize business value. By the end of the book, you’ll know how to handle scale for both data storage and analytics, how Apache Spark can be leveraged to handle scalability, and how R and Python can be used for analytic modeling.

7
E-book

Apache Hadoop 3 Quick Start Guide. Learn about big data processing and analytics

Hrishikesh Vijay Karambelkar

Apache Hadoop is a widely used distributed data platform. It enables large datasets to be processed efficiently across a cluster of machines, instead of relying on one large computer to store and process the data. This book will get you started with the Hadoop ecosystem, and introduce you to the main technical topics, including MapReduce, YARN, and HDFS. The book begins with an overview of big data and Apache Hadoop. Then, you will set up a pseudo Hadoop development environment and a multi-node enterprise Hadoop cluster. You will see how a parallel programming paradigm such as MapReduce can solve many complex data processing problems. The book also covers the important aspects of the big data software development lifecycle, including quality assurance and control, performance, administration, and monitoring. You will then learn about the Hadoop ecosystem, and tools such as Kafka, Sqoop, Flume, Pig, Hive, and HBase. Finally, you will look at advanced topics, including real-time streaming using Apache Storm, and data analytics using Apache Spark. By the end of the book, you will be well versed in different configurations of the Hadoop 3 cluster.

8
E-book

Apache Ignite Quick Start Guide. Distributed data caching and processing made easy

Sujoy Acharya

Apache Ignite is a distributed in-memory platform designed to scale and process large volumes of data. It can be integrated with microservices as well as monolithic systems, and can be used as a scalable, highly available and performant deployment platform for microservices. This book will teach you to use Apache Ignite for building a high-performance, scalable, highly available system architecture with data integrity. The book takes you through the basics of Apache Ignite and in-memory technologies. You will learn about installing and clustering Ignite nodes, caching topologies, and various caching strategies, such as cache aside, read and write through, and write behind. Next, you will delve into detailed aspects of Ignite’s data grid: web session clustering and querying data. You will learn how to process large volumes of data using the compute grid and Ignite’s map-reduce and executor service. You will learn about the memory architecture of Apache Ignite and about monitoring memory and caches. You will use Ignite for complex event processing, event streaming, and the time-series predictions of opportunities and threats. Additionally, you will go through off-heap and on-heap caching, swapping, and native and Spring framework integration with Apache Ignite. By the end of this book, you will be confident with all the features of Apache Ignite 2.x that can be used to build a high-performance system architecture.

9
E-book

Apache Kafka 1.0 Cookbook. Over 100 practical recipes on using distributed enterprise messaging to handle real-time data

Raúl Estrada

Apache Kafka provides a unified, high-throughput, low-latency platform to handle real-time data feeds. This book will show you how to use Kafka efficiently, and contains practical solutions to the common problems that developers and administrators usually face while working with it. This practical guide contains easy-to-follow recipes to help you set up, configure, and use Apache Kafka in the best possible manner. You will use Apache Kafka Consumers and Producers to build effective real-time streaming applications. The book covers the recently released Kafka version 1.0, the Confluent Platform and Kafka Streams. The programming aspect covered in the book will teach you how to perform important tasks such as message validation, enrichment and composition. Recipes that focus on optimizing the performance of your Kafka cluster and on integrating Kafka with a variety of third-party tools, such as Apache Hadoop, Apache Spark, and Elasticsearch, will greatly ease your day-to-day work with Kafka. Finally, we cover tasks related to monitoring and securing your Apache Kafka cluster using tools such as Ganglia and Graphite. If you're looking to become the go-to person in your organization when it comes to working with Apache Kafka, this book is the only resource you need to have.

10
E-book

Apache Kafka Quick Start Guide. Leverage Apache Kafka 2.0 to simplify real-time data processing for distributed applications

Raúl Estrada

Apache Kafka is a great open source platform for handling your real-time data pipeline to ensure high-speed filtering and pattern matching on the fly. In this book, you will learn how to use Apache Kafka for efficient processing of distributed applications and will get familiar with solving everyday problems in fast data and processing pipelines. This book focuses on programming rather than the configuration management of Kafka clusters or DevOps. It starts off with the installation and setting up the development environment, before quickly moving on to performing fundamental messaging operations such as validation and enrichment. Here you will learn about message composition with pure Kafka API and Kafka Streams. You will look into the transformation of messages in different formats, such as text, binary, XML, JSON, and AVRO. Next, you will learn how to expose the schemas contained in Kafka with the Schema Registry. You will then learn how to work with all relevant connectors with Kafka Connect. While working with Kafka Streams, you will perform various interesting operations on streams, such as windowing, joins, and aggregations. Finally, through KSQL, you will learn how to retrieve, insert, modify, and delete data streams, and how to manipulate watermarks and windows.

11
E-book

Apache Spark 2: Data Processing and Real-Time Analytics. Master complex big data processing, stream analytics, and machine learning with Apache Spark

Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, ...

Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and building your own data flow and machine learning programs on this platform. You will work with the different modules in Apache Spark, such as interactive querying with Spark SQL, using DataFrames and datasets, implementing streaming analytics with Spark Streaming, and applying machine learning and deep learning techniques on Spark using MLlib and various external tools. By the end of this elaborately designed Learning Path, you will have all the knowledge you need to master Apache Spark, and build your own big data processing and analytics pipeline quickly and without any hassle. This Learning Path includes content from the following Packt products:
• Mastering Apache Spark 2.x by Romeo Kienzler
• Scala and Spark for Big Data Analytics by Md. Rezaul Karim, Sridhar Alla
• Apache Spark 2.x Machine Learning Cookbook by Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei

12
E-book

Apache Spark 2.x Cookbook. Over 70 cloud-ready recipes for distributed Big Data processing and analytics

Rishi Yadav

While Apache Spark 1.x gained a lot of traction and adoption in the early years, Spark 2.x delivers notable improvements in API, schema awareness, performance, Structured Streaming, and simpler building blocks for creating better, faster, smarter, and more accessible big data applications. This book uncovers all these features in the form of structured recipes to analyze and mature large and complex sets of data. Starting with installing and configuring Apache Spark with various cluster managers, you will learn to set up development environments. Further on, you will be introduced to working with RDDs, DataFrames and Datasets to operate on schema-aware data, and real-time streaming with various sources such as Twitter Stream and Apache Kafka. You will also work through recipes on machine learning, including supervised learning, unsupervised learning, and recommendation engines in Spark. Last but not least, the final few chapters delve deeper into the concepts of graph processing using GraphX, securing your implementations, cluster optimization, and troubleshooting.

13
E-book

Apache Spark 2.x for Java Developers. Explore big data at scale using Apache Spark 2.x Java APIs

Sourav Gulati, Sumit Kumar

Apache Spark is the buzzword in the big data industry right now, especially with the increasing need for real-time streaming and data processing. While Spark is built on Scala, the Spark Java API exposes all the Spark features available in the Scala version for Java developers. This book will show you how you can implement various functionalities of the Apache Spark framework in Java, without stepping out of your comfort zone. The book starts with an introduction to the Apache Spark 2.x ecosystem, followed by explaining how to install and configure Spark, and refreshes the Java concepts that will be useful to you when consuming Apache Spark's APIs. You will explore RDD and its associated common Action and Transformation Java APIs, set up a production-like clustered environment, and work with Spark SQL. Moving on, you will perform near-real-time processing with Spark streaming, Machine Learning analytics with Spark MLlib, and graph processing with GraphX, all using various Java packages. By the end of the book, you will have a solid foundation in implementing components in the Spark framework in Java to build fast, real-time applications.

14
E-book

Apache Spark Quick Start Guide. Quickly learn the art of writing efficient big data applications with Apache Spark

Shrey Mehrotra, Akash Grade

Apache Spark is a flexible framework that allows processing of batch and real-time data. Its unified engine has made it quite popular for big data use cases. This book will help you to get started with Apache Spark 2.0, one of the most popular big data processing frameworks, and write big data applications for a variety of use cases. Although this book is intended to help you get started with Apache Spark, it also focuses on explaining the core concepts. This practical guide provides a quick start to the Spark 2.0 architecture and its components. It teaches you how to set up Spark on your local machine. As we move ahead, you will be introduced to resilient distributed datasets (RDDs) and DataFrame APIs, and their corresponding transformations and actions. Then, we move on to the life cycle of a Spark application and learn about the techniques used to debug slow-running applications. You will also go through Spark’s built-in modules for SQL, streaming, machine learning, and graph analysis. Finally, the book will lay out the best practices and optimization techniques that are key for writing efficient Spark applications. By the end of this book, you will have a sound fundamental understanding of the Apache Spark framework and you will be able to write and optimize Spark applications.

15
E-book

Apache Superset Quick Start Guide. Develop interactive visualizations by creating user-friendly dashboards

Shashank Shekhar

Apache Superset is a modern, open source, enterprise-ready business intelligence (BI) web application. With the help of this book, you will see how Superset integrates with popular databases like Postgres, Google BigQuery, Snowflake, and MySQL. You will learn to create real-time data visualizations and dashboards on modern web browsers for your organization using Superset. First, we look at the fundamentals of Superset, and then get it up and running. You'll go through the requisite installation, configuration, and deployment. Then, we will discuss different columnar data types, analytics, and the visualizations available. You'll also see the security tools available to the administrator to keep your data safe. You will learn how to visualize relationships as graphs instead of coordinates on plain orthogonal axes. This will help you when you upload your own entity relationship dataset and analyze the dataset in new, different ways. You will also see how to analyze geographical regions by working with location data. Finally, we cover a set of tutorials on dashboard designs frequently used by analysts, business intelligence professionals, and developers.

16
E-book

Applied Data Science with Python and Jupyter. Use powerful industry-standard tools to unlock new, actionable insights from your data

Alex Galea

Getting started with data science doesn't have to be an uphill battle. Applied Data Science with Python and Jupyter is a step-by-step guide ideal for beginners who know a little Python and are looking for a fast-paced introduction to these concepts. In this book, you'll learn every aspect of the standard data workflow process, including collecting, cleaning, investigating, visualizing, and modeling data. You'll start with the basics of Jupyter, which will be the backbone of the book. After familiarizing yourself with its standard features, you'll look at an example of it in practice with your first analysis. In the next lesson, you dive right into predictive analytics, where multiple classification algorithms are implemented. Finally, the book ends by looking at data collection techniques. You'll see how web data can be acquired with scraping techniques and via APIs, and then briefly explore interactive visualizations.

17
E-book

Applied Data Visualization with R and ggplot2. Create useful, elaborate, and visually appealing plots

Dr. Tania Moulik

Applied Data Visualization with R and ggplot2 introduces you to the world of data visualization by taking you through the basic features of ggplot2. To start with, you’ll learn how to set up the R environment, followed by getting insights into the grammar of graphics and geometric objects before you explore the plotting techniques. You’ll discover what layers, scales, coordinates, and themes are, and study how you can use them to transform your data into aesthetically pleasing graphs. Once you’ve grasped the basics, you’ll move on to studying simple plots such as histograms and advanced plots such as superimposed and density plots. You’ll also get to grips with plotting trends, correlations, and statistical summaries. By the end of this book, you’ll have created data visualizations that will impress your clients.

18
E-book

Azure Data and AI Architect Handbook. Adopt a structured approach to designing data and AI solutions at scale on Microsoft Azure

Olivier Mertens, Breght Van Baelen

With data’s growing importance in businesses, the need for cloud data and AI architects has never been higher. The Azure Data and AI Architect Handbook is designed to assist any data professional or academic looking to advance their cloud data platform designing skills. This book will help you understand all the individual components of an end-to-end data architecture and how to piece them together into a scalable and robust solution. You’ll begin by getting to grips with core data architecture design concepts and Azure Data & AI services, before exploring cloud landing zones and best practices for building up an enterprise-scale data platform from scratch. Next, you’ll take a deep dive into various data domains such as data engineering, business intelligence, data science, and data governance. As you advance, you’ll cover topics ranging from learning different methods of ingesting data into the cloud to designing the right data warehousing solution, managing large-scale data transformations, extracting valuable insights, and learning how to leverage cloud computing to drive advanced analytical workloads. Finally, you’ll discover how to add data governance, compliance, and security to solutions. By the end of this book, you’ll have gained the expertise needed to become a well-rounded Azure Data & AI architect.

19
E-book

Azure Data Engineer Associate Certification Guide. A hands-on reference guide to developing your data engineering skills and preparing for the DP-203 exam

Newton Alex

Azure is one of the leading cloud providers in the world, providing numerous services for data hosting and data processing. Most companies today are either cloud-native or are migrating to the cloud much faster than ever. This has led to an explosion of data engineering jobs, with aspiring and experienced data engineers trying to outshine each other. Gaining the DP-203: Azure Data Engineer Associate certification is a sure-fire way of showing future employers that you have what it takes to become an Azure Data Engineer. This book will help you prepare for the DP-203 examination in a structured way, covering all the topics specified in the syllabus with detailed explanations and exam tips. The book starts by covering the fundamentals of Azure, and then takes the example of a hypothetical company and walks you through the various stages of building data engineering solutions. Throughout the chapters, you'll learn about the various Azure components involved in building the data systems and will explore them using a wide range of real-world use cases. Finally, you’ll work on sample questions and answers to familiarize yourself with the pattern of the exam. By the end of this Azure book, you'll have gained the confidence you need to pass the DP-203 exam with ease and land your dream job in data engineering.

20
E-book

Badanie danych. Raport z pierwszej linii działań

Rachel Schutt, Cathy O'Neil

A unique introduction to data science! Nowadays the most valuable asset is information. Enormous amounts of data are stored in vast databases, and the key to success is analyzing them skillfully and drawing conclusions. This is a rapidly developing field that until now lacked solid textbooks allowing a thorough grounding in the area. Fortunately, that has changed! In this unique book, researchers from the largest companies in the IT industry share effective data analysis techniques. From successive chapters you will learn what data science, a data model, and an A/B test are. You will also gain knowledge of statistical inference, algorithms, the R language, and data visualization. Reach for this book if you want to learn how to detect fraud, use MapReduce, and investigate causality. It is a must-have on the shelf of any reader interested in examining data. Among the topics covered in the book you will find: statistical inference, exploratory data analysis, and the data science process (methodology); algorithms; spam filters, the naive Bayes algorithm, and preliminary data processing; logistic regression; financial modeling; recommendation engines and causality; data visualization; social networks and data journalism; data engineering, MapReduce, Pregel, and Hadoop. Draw valuable conclusions from the information you have!

21
E-book

Bayesian Analysis with Python. Introduction to statistical modeling and probabilistic programming using PyMC3 and ArviZ - Second Edition

Osvaldo Martin

The second edition of Bayesian Analysis with Python is an introduction to the main concepts of applied Bayesian inference and its practical implementation in Python using PyMC3, a state-of-the-art probabilistic programming library, and ArviZ, a new library for exploratory analysis of Bayesian models. The main concepts of Bayesian statistics are covered using a practical and computational approach. Synthetic and real data sets are used to introduce several types of models, such as generalized linear models for regression and classification, mixture models, hierarchical models, and Gaussian processes, among others. By the end of the book, you will have a working knowledge of probabilistic modeling and you will be able to design and implement Bayesian models for your own data science problems. After reading the book you will be better prepared to delve into more advanced material or specialized statistical modeling if you need to.

22
E-book

Become a Python Data Analyst. Perform exploratory data analysis and gain insight into scientific computing using Python

Alvaro Fuentes

Python is one of the most common and popular languages preferred by leading data analysts and statisticians for working with massive datasets and complex data visualizations. Become a Python Data Analyst introduces Python’s most essential tools and libraries necessary to work with the data analysis process, right from preparing data to performing simple statistical analyses and creating meaningful data visualizations. In this book, we will cover Python libraries such as NumPy, pandas, matplotlib, seaborn, SciPy, and scikit-learn, and apply them in practical data analysis and statistics examples. As you make your way through the chapters, you will learn to efficiently use the Jupyter Notebook to operate and manipulate data using NumPy and the pandas library. In the concluding chapters, you will gain experience in building simple predictive models and carrying out statistical computation and analysis using rich Python tools and proven data analysis techniques. By the end of this book, you will have hands-on experience performing data analysis with Python.

23
E-book

Big Data Analytics with Hadoop 3. Build highly effective analytics solutions to gain valuable insight into your big data

Sridhar Alla

Apache Hadoop is the most popular platform for big data processing, and can be combined with a host of other big data tools to build powerful analytics solutions. Big Data Analytics with Hadoop 3 shows you how to do just that, by providing insights into the software as well as its benefits with the help of practical examples. Once you have taken a tour of Hadoop 3’s latest features, you will get an overview of HDFS, MapReduce, and YARN, and how they enable faster, more efficient big data processing. You will then move on to learning how to integrate Hadoop with open source tools such as Python and R to analyze and visualize data and perform statistical computing on big data. As you get acquainted with all this, you will explore how to use Hadoop 3 with Apache Spark and Apache Flink for real-time data analytics and stream processing. In addition to this, you will understand how to use Hadoop to build analytics solutions on the cloud and an end-to-end pipeline to perform big data analysis using practical use cases. By the end of this book, you will be well versed in the analytical capabilities of the Hadoop ecosystem. You will be able to build powerful solutions to perform big data analytics and get insight effortlessly.

24
E-book

Big Data Architect's Handbook. A guide to building proficiency in tools and systems used by leading big data experts

Syed Muhammad Fahad Akhtar

Big data architects are the “masters” of data, and hold high value in today’s market. Handling big data, be it of good or bad quality, is not an easy task. The prime job for any big data architect is to build an end-to-end big data solution that integrates data from different sources and analyzes it to find useful, hidden insights. Big Data Architect’s Handbook takes you through developing a complete, end-to-end big data pipeline, which will lay the foundation for you and provide the necessary knowledge required to be an architect in big data. Right from understanding the design considerations to implementing a solid, efficient, and scalable data pipeline, this book walks you through all the essential aspects of big data. It also gives you an overview of how you can leverage the power of various big data tools such as Apache Hadoop and ElasticSearch in order to bring them together and build an efficient big data solution. By the end of this book, you will be able to build your own design system which integrates, maintains, visualizes, and monitors your data. In addition, you will have a smooth design flow in each process, putting insights into action.