Big data - Ebooks - BIBLIO Library | BIBLIO ebookpoint

945

EBOOK

Sencha Charts Essentials. Create stunning charts and visualizations for both web and mobile applications

Ajit Kumar

If you are an Ext JS or Sencha Touch developer, designer, or architect who wants to build enterprise-scale data visualization capabilities using Sencha, then this book is ideal for you. You should know HTML, JavaScript, and CSS, and in particular the fundamentals of Sencha Ext JS or Sencha Touch. Some familiarity with SVG and HTML5 Canvas is helpful but not required.

946

EBOOK

Serverless ETL and Analytics with AWS Glue. Your comprehensive reference guide to learning about AWS Glue and its features

Vishal Pathak, Subramanya Vajiraya, Noritaka Sekiyama, Tomohiro...

Organizations these days have gravitated toward services such as AWS Glue that undertake undifferentiated heavy lifting and provide serverless Spark, enabling you to create and manage data lakes in a serverless fashion. This guide shows you how AWS Glue can be used to solve real-world problems along with helping you learn about data processing, data integration, and building data lakes. Beginning with AWS Glue basics, this book teaches you how to perform various aspects of data analysis such as ad hoc queries, data visualization, and real-time analysis using this service. It also provides a walk-through of CI/CD for AWS Glue and how to shift left on quality using automated regression tests. You’ll find out how data security aspects such as access control, encryption, auditing, and networking are implemented, as well as getting to grips with useful techniques such as picking the right file format, compression, partitioning, and bucketing. As you advance, you’ll discover AWS Glue features such as crawlers, Lake Formation, governed tables, lineage, DataBrew, Glue Studio, and custom connectors. The concluding chapters help you to understand various performance tuning, troubleshooting, and monitoring options. By the end of this AWS book, you’ll be able to create, manage, troubleshoot, and deploy ETL pipelines using AWS Glue.

947

EBOOK

Seven NoSQL Databases in a Week. Get up and running with the fundamentals and functionalities of seven of the most popular NoSQL databases

Sudarshan Kadambi, Xun (Brian) Wu

This is the golden age of open source NoSQL databases. With enterprises having to work with large amounts of unstructured data and moving away from expensive monolithic architecture, the adoption of NoSQL databases is rapidly increasing. Being familiar with the popular NoSQL databases and knowing how to use them is a must for budding DBAs and developers. This book introduces you to the different types of NoSQL databases and gets you started with seven of the most popular NoSQL databases used by enterprises today. We start off with a brief overview of what NoSQL databases are, followed by an explanation of why and when to use them. The book then covers the seven most popular databases in each of these categories: MongoDB, Amazon DynamoDB, Redis, HBase, Cassandra, InfluxDB, and Neo4j. The book doesn't go into too much detail about each database but teaches you enough to get started with them. By the end of this book, you will have a thorough understanding of the different NoSQL databases and their functionalities, empowering you to select and use the right database according to your needs.

948

EBOOK

Siatka danych. Nowoczesna koncepcja samoobsługowej infrastruktury danych

Zhamak Dehghani

Access to data is a prerequisite for the growth of many organizations. To take full advantage of its potential and derive concrete value from it, data must be managed appropriately. The solutions currently in use can no longer keep up with the complexity of today's organizations, the proliferation of data sources, and the growing aspirations of engineers who are advancing artificial intelligence and data analysis techniques. Data mesh may be the answer to these needs, but putting the concept into practice requires a significant shift in thinking. This book explains the data mesh paradigm in detail while focusing on its practical application. Under this innovative approach, data should be treated as a product and domains as the primary concern. Beyond explaining the paradigm, the book describes the principles for designing a high-level architecture of data mesh components and offers tips and advice on implementing a data mesh in an organization in an evolutionary way. The subject is treated comprehensively: technological and organizational issues are covered, as well as sociological and cultural ones. This makes it valuable reading for architects and engineers as well as for researchers, data analysts, and team leaders and managers. In the book: a comprehensive introduction to the data mesh paradigm; data mesh and its components; designing a data mesh architecture; developing and executing a data mesh strategy; a decentralized data ownership model; moving from data warehouses and data lakes to a distributed data mesh. Data mesh: the next stage in the evolution of big data technology!

949

EBOOK

Simplify Big Data Analytics with Amazon EMR. A beginner's guide to learning and implementing Amazon EMR for building data analytics solutions

Sakti Mishra

Amazon EMR, formerly Amazon Elastic MapReduce, provides a managed Hadoop cluster in Amazon Web Services (AWS) that you can use to implement batch or streaming data pipelines. By gaining expertise in Amazon EMR, you can design and implement data analytics pipelines with persistent or transient EMR clusters in AWS. This book is a practical guide to Amazon EMR for building data pipelines. You'll start by understanding the Amazon EMR architecture, cluster nodes, features, and deployment options, along with their pricing. Next, the book covers the various big data applications that EMR supports. You'll then focus on the advanced configuration of EMR applications, hardware, networking, security, troubleshooting, logging, and the different SDKs and APIs it provides. Later chapters will show you how to implement common Amazon EMR use cases, including batch ETL with Spark, real-time streaming with Spark Streaming, and handling UPSERT in S3 Data Lake with Apache Hudi. Finally, you'll orchestrate your EMR jobs and strategize on-premises Hadoop cluster migration to EMR. In addition to this, you'll explore best practices and cost optimization techniques while implementing your data analytics pipeline in EMR. By the end of this book, you'll be able to build and deploy Hadoop- or Spark-based apps on Amazon EMR and also migrate your existing on-premises Hadoop workloads to AWS.

950

EBOOK

Simplifying Android Development with Coroutines and Flows. Learn how to use Kotlin coroutines and the flow API to handle data streams asynchronously in your Android app

Jomar Tigcal

Coroutines and flows are the new recommended way for developers to carry out asynchronous programming in Android using simple, modern, and testable code. This book will teach you how coroutines and flows work and how to use them in building Android applications, along with helping you to develop modern Android applications with asynchronous programming using real data. The book begins by showing you how to create and handle Kotlin coroutines on Android. You’ll explore asynchronous programming in Kotlin, and understand how to test Kotlin coroutines. Next, you'll learn about Kotlin flows on Android, and have a closer look at using Kotlin flows by getting to grips with handling flow cancellations and exceptions and testing the flows. By the end of this book, you'll have the skills you need to build high-quality and maintainable Android applications using coroutines and flows.

951

EBOOK

Simplifying Data Engineering and Analytics with Delta. Create analytics-ready data that fuels artificial intelligence and business intelligence

Anindita Mahapatra

Delta helps you generate reliable insights at scale and simplifies architecture around data pipelines, allowing you to focus primarily on refining the use cases being worked on. This is especially important when you consider that existing architecture is frequently reused for new use cases. In this book, you’ll learn about the principles of distributed computing, data modeling techniques, and big data design patterns and templates that help solve end-to-end data flow problems for common scenarios and are reusable across use cases and industry verticals. You’ll also learn how to recover from errors and the best practices around handling structured, semi-structured, and unstructured data using Delta. After that, you’ll get to grips with features such as ACID transactions on big data, disciplined schema evolution, time travel to help rewind a dataset to a different time or version, and unified batch and streaming capabilities that will help you build agile and robust data products. By the end of this Delta book, you’ll be able to use Delta as the foundational block for creating analytics-ready data that fuels all AI/BI use cases.

952

EBOOK

Simulation for Data Science with R. Effective Data-driven Decision Making

Matthias Templ

Data Science with R aims to teach you how to begin performing data science tasks by taking advantage of R's powerful ecosystem of packages. As one of the most widely used programming languages for data science, R can be a powerful tool for handling the complexities of varied real-world data sets. The book provides a computational and methodological framework for statistical simulation. Through this book, you will get to grips with the software environment R. After getting to know the background of popular methods in the area of computational statistics, you will see some applications in R to better understand the methods and gain experience of working with real-world data and real-world problems. This book helps uncover the large-scale patterns in complex systems where interdependencies and variation are critical. An effective simulation is driven by data-generating processes that accurately reflect real physical populations. You will learn how to plan and structure a simulation project to aid in the decision-making process as well as the presentation of results. By the end of this book, you will be familiar with the software environment R, know the background of popular methods in the area, and have worked through applications in R that deepen your understanding of those methods and give you experience with real-world data and problems.

953

EBOOK

Skazany na sukces. Kariera w Data Science

Jacqueline Nolis, Emily Robinson

Data science is growing in importance. Data is to the economy what coal, steel, and oil once were. The ability to use the knowledge contained in data determines how efficiently a business is run and drives the development of new economic models, solutions, and relationships. Data scientists are already in high demand on the job market. To make full use of the opportunities that arise, however, you need to know how to approach the difficult task of building a career path and following it at a pace that suits you. This is a practical guide that will help you land your first data science job more easily, become a valued specialist sooner, and, as your career develops, spot opportunities for promotion and for moving to a more attractive job with increasing accuracy. You will learn how to acquire the basic skills and what specific roles actually look like. The book also describes how to get through the recruitment process successfully and settle into a new environment. It includes valuable advice on moving up into management positions. As a data scientist, you will quickly find that the non-technical knowledge presented here is essential to succeeding in the field. With this book, you will learn how: to build a great portfolio of data science projects; to find, evaluate, and negotiate offers; to change jobs gracefully; to choose and successfully pursue career paths; other outstanding data analysts have succeeded! Data science: a discipline, a passion, and a way of life!

954

EBOOK

Slaying Excel Dragons. A Beginner's Guide to Conquering Excel's Frustrations and Making Excel Fun

MrExcel's Holy Macro! Books, Mike Girvin

This comprehensive guide is designed to elevate your Excel skills from beginner to advanced. Starting with the fundamentals, you'll learn how to navigate Excel's interface, use essential keyboard shortcuts, and manage data efficiently. As you progress, you'll dive into complex features like PivotTables, dynamic ranges, and advanced formatting, gaining the ability to handle intricate data tasks with ease. The guide also covers powerful formulas and functions, including VLOOKUP, INDEX/MATCH, and logical tests. These tools will empower you to automate calculations, perform detailed analyses, and streamline your workflow. Additionally, you'll explore Excel’s data analysis features, such as sorting, filtering, and creating dynamic charts, enabling you to present your data clearly and effectively. By the end of this book, you'll have a deep understanding of Excel's capabilities, equipped with the skills to tackle any spreadsheet challenge. Whether you're preparing for advanced data analysis or seeking to optimize your day-to-day tasks, this guide provides the knowledge and practical experience to make Excel work for you.

955

EBOOK

Smart Internet of Things Projects. Discover how to build your own smart Internet of Things projects and bring a new degree of interconnectivity to your world

Agus Kurniawan

Internet of Things (IoT) is a groundbreaking technology that involves connecting numerous physical devices to the Internet and controlling them. Creating basic IoT projects is common, but imagine building smart IoT projects that can extract data from physical devices, thereby making decisions by themselves. Our book overcomes the challenge of analyzing data from physical devices and accomplishes all that your imagination can dream up by teaching you how to build smart IoT projects. Basic statistics and various applied algorithms in data science and machine learning are introduced to accelerate your knowledge of how to integrate a decision system into a physical device. This book contains IoT projects such as building a smart temperature controller, creating your own vision machine project, building an autonomous mobile robot car, controlling IoT projects through voice commands, building IoT applications utilizing cloud technology and data science, and many more. We will also leverage a small yet powerful IoT chip, Raspberry Pi with Arduino, in order to integrate a smart decision-making system in the IoT projects.

956

EBOOK

Smarter Decisions - The Intersection of Internet of Things and Decision Science. A comprehensive guide for solving IoT business problems using decision science

Jojo Moolayil

With an increasing number of devices getting connected to the Internet, massive amounts of data are being generated that can be used for analysis. This book helps you understand the Internet of Things and decision science in depth, and solve business use cases. With IoT, the frequency and impact of the problem are huge. Addressing a problem with such a huge impact requires a very structured approach. The entire journey of addressing the problem by defining it, designing the solution, and executing it using decision science is articulated in this book through engaging and easy-to-understand business use cases. You will get a detailed understanding of IoT, decision science, and the art of solving a business problem in IoT through decision science. By the end of this book, you’ll have an understanding of the complex aspects of decision making in IoT and will be able to take that knowledge with you onto whatever project calls for it.

957

EBOOK

Software Architecture Patterns for Serverless Systems. Architecting for innovation with events, autonomous services, and micro frontends

John Gilbert

As businesses are undergoing a digital transformation to keep up with competition, it is now more important than ever for IT professionals to design systems to keep up with the rate of change while maintaining stability. This book takes you through the architectural patterns that power enterprise-grade software systems and the key architectural elements that enable change (such as events, autonomous services, and micro frontends), along with showing you how to implement and operate anti-fragile systems. First, you’ll divide up a system and define boundaries so that your teams can work autonomously and accelerate innovation. You’ll cover low-level event and data patterns that support the entire architecture, while getting up and running with the different autonomous service design patterns. Next, the book will focus on best practices for security, reliability, testability, observability, and performance. You’ll combine all that you've learned and build upon that foundation, exploring the methodologies of continuous experimentation, deployment, and delivery before delving into some final thoughts on how to start making progress. By the end of this book, you'll be able to architect your own event-driven, serverless systems that are ready to adapt and change so that you can deliver value at the pace needed by your business.

958

EBOOK

Solidity Programming Essentials. A beginner's guide to build smart contracts for Ethereum and blockchain

Ritesh Modi

Solidity is a contract-oriented language whose syntax is highly influenced by JavaScript, and is designed to compile code for the Ethereum Virtual Machine. Solidity Programming Essentials will be your guide to understanding Solidity programming to build smart contracts for Ethereum and blockchain from the ground up. We begin with a brief run-through of blockchain, Ethereum, and their most important concepts or components. You will learn how to install all the necessary tools to write, test, and debug Solidity contracts on Ethereum. Then, you will explore the layout of a Solidity source file and work with the different data types. The next set of recipes will help you work with operators, control structures, and data structures while building your smart contracts. We take you through function calls, return types, function modifiers, and recipes in object-oriented programming with Solidity. Learn all you can on event logging and exception handling, as well as testing and debugging smart contracts. By the end of this book, you will be able to write, deploy, and test smart contracts in Ethereum. This book will bring forth the essence of writing contracts using Solidity and also help you develop Solidity skills in no time.

959

EBOOK

Spark. Błyskawiczna analiza danych. Wydanie II

Jules S. Damji, Brooke Wenig, Tathagata Das,...

Apache Spark is open source software designed for cluster processing of data supplied in a variety of formats. It delivers exceptional performance and supports both batch and streaming modes. The framework is also well suited to running complex applications, including machine learning and predictive analytics algorithms. All this makes Apache Spark an excellent choice for developers working with big data as well as data exploration and analysis. This book is intended for data engineers and developers who want to use Spark to perform complex data analyses and apply machine learning algorithms, even when the data comes from different sources. It explains how Apache Spark can be used to read and unify large data sets to build reliable data lakes, how to run interactive SQL queries, and how to build pipelines with MLlib and deploy models with the MLflow library. It also covers how a Spark application interacts with Spark's distributed components and the modes for deploying it in particular environments. In the book: structured APIs for Python, SQL, Scala, and Java; Spark operations and the SQL engine; Spark configurations and the Spark UI; connecting to data sources: JSON, Parquet, CSV, Avro, ORC, Hive, S3, and Kafka; analytical operations on batch and streaming data; reliable data pipelines and machine learning pipelines. Spark: build scalable and reliable big data applications!

961

EBOOK

Spark Cookbook. With over 60 recipes on Spark, covering Spark Core, Spark SQL, Spark Streaming, MLlib, and GraphX libraries this is the perfect Spark book to always have by your side

Rishi Yadav

If you are a data engineer, an application developer, or a data scientist who would like to leverage the power of Apache Spark to get better insights from big data, then this is the book for you.