Big data

1

Ebook

100 sposobów na Excel 2007 PL. Tworzenie funkcjonalnych arkuszy

Use the full power of Excel to build functional, attractive worksheets. How do you analyze and manage data? How do you get the most out of PivotTables? How do you create customized charts? Most Excel users know only a small fraction of what the program can do. Yet there are many methods that significantly boost its productivity and let you build attractive spreadsheets without a time-consuming learning curve. The new version of Excel makes it easier to use, among other things, PivotTables, conditional formatting and range names, Live Preview, galleries of predefined styles, and SmartArt graphics. As a result, anyone can now use more sophisticated visual and graphical elements in their worksheets. Today, when everything moves faster and faster, time is becoming one of the most important and sought-after assets. The book "100 sposobów na Excel 2007 PL. Tworzenie funkcjonalnych arkuszy" helps you save that time: it offers more than a hundred ready-made methods for building functional, attractive worksheets, along with fast and reliable solutions to complicated problems. These methods are also worth using to study and apply some of the capabilities of Visual Basic for Applications (VBA), so that you can adapt every technique presented here to your own needs. Topics covered: workbooks and worksheets; built-in mechanisms for analyzing and managing data; methods for creating names and cell ranges; PivotTables; formulas and functions; charts and macros; using Excel with other Office applications. Everything you would want to know about Excel in order to put it to use right away. Discover more than a hundred ways to work effectively with Excel!

2

Ebook

15 Math Concepts Every Data Scientist Should Know. Understand and learn how to apply the math behind data science algorithms

David Hoyle

Data science combines the power of data with the rigor of scientific methodology, with mathematics providing the tools and frameworks for analysis, algorithm development, and deriving insights. As machine learning algorithms become increasingly complex, a solid grounding in math is crucial for data scientists. David Hoyle, with over 30 years of experience in statistical and mathematical modeling, brings unparalleled industrial expertise to this book, drawing from his work in building predictive models for the world's largest retailers. Encompassing 15 crucial concepts, this book covers a spectrum of mathematical techniques to help you understand a vast range of data science algorithms and applications. Starting with essential foundational concepts, such as random variables and probability distributions, you’ll learn why data varies, and explore matrices and linear algebra to transform that data. Building upon this foundation, the book spans general intermediate concepts, such as model complexity and network analysis, as well as advanced concepts such as kernel-based learning and information theory. Each concept is illustrated with Python code snippets demonstrating its practical application to solve problems. By the end of the book, you’ll have the confidence to apply key mathematical concepts to your data science challenges.

3

Ebook

Access. Analiza danych. Receptury

Wayne Freeze, Ken Bluttman

Use the Access database like a professional! How do you apply statistical indicators to the analysis of business data? How do you extend SQL queries with VBA scripts? How do you process data and move it between Access databases? Access is a well-established tool for versatile data processing and analysis. It has quite a few hidden mechanisms that let you efficiently carry out tasks that may at first seem complicated. The book presents sample queries, methods for moving data between Access databases, calculations of many financial and business indicators, and a good deal more, all from the perspective of data analysis and processing. Each recipe comes with a complete description of the solution, a detailed discussion of the procedure, and an analysis of the code. Access. Analiza danych. Receptury is a universal handbook intended for both beginners and experienced users of Access. Thanks to its clear language and the breadth of topics covered, readers at any level can broaden their knowledge. It contains plenty of interesting tips and techniques that make everyday database work easier, which makes it attractive even to people who already use Access expertly. It is also a compendium of the knowledge needed by anyone who wants to extract truly valuable information from data sets. Topics covered: creating queries of various types; inserting, updating, and deleting data; processing text and numbers stored as strings; using tables, modifying Windows system contents, and encrypting data; using the FileSystemObject object, processing XML and XSLT data, and communicating with SQL databases; solving business problems; calculating statistical indicators. A database is the foundation of a business: see how to manage it effectively!

4

Ebook

Advanced Deep Learning with Keras. Apply deep learning techniques, autoencoders, GANs, variational autoencoders, deep reinforcement learning, policy gradients, and more

Rowel Atienza

Recent developments in deep learning, including Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and Deep Reinforcement Learning (DRL) are creating impressive AI results in our news headlines - such as AlphaGo Zero mastering the game of Go, and generative AI that can create art paintings that sell for over $400k because they are so human-like. Advanced Deep Learning with Keras is a comprehensive guide to the advanced deep learning techniques available today, so you can create your own cutting-edge AI. Using Keras as an open-source deep learning library, you'll find hands-on projects throughout that show you how to create more effective AI with the latest techniques. The journey begins with an overview of MLPs, CNNs, and RNNs, which are the building blocks for the more advanced techniques in the book. You’ll learn how to implement deep learning models with Keras and TensorFlow 1.x, and move forwards to advanced techniques, as you explore deep neural network architectures, including ResNet and DenseNet, and how to create autoencoders. You then learn all about GANs, and how they can open new levels of AI performance. Next, you’ll get up to speed with how VAEs are implemented, and you’ll see how GANs and VAEs have the generative power to synthesize data that can be extremely convincing to humans - a major stride forward for modern AI. To complete this set of advanced techniques, you'll learn how to implement DRL such as Deep Q-Learning and Policy Gradient Methods, which are critical to many modern results in AI.

5

Ebook

Algorithms and Data Structures with Python. A comprehensive guide to data structures & algorithms via an interactive learning experience

Cuantum Technologies LLC

Begin your journey with an introduction to Python and algorithms, laying the groundwork for more complex topics. You will start with the basics of Python programming, ensuring a solid foundation before diving into more advanced and sophisticated concepts. As you progress, you'll explore elementary data containers, gaining an understanding of their role in algorithm development. Midway through the course, you’ll delve into the art of sorting and searching, mastering techniques that are crucial for efficient data handling. You will then venture into hierarchical data structures, such as trees and graphs, which are essential for understanding complex data relationships. By mastering algorithmic techniques, you’ll learn how to implement solutions for a variety of computational challenges. The latter part of the course focuses on advanced topics, including network algorithms, string and pattern deciphering, and advanced computational problems. You'll apply your knowledge through practical case studies and optimizations, bridging the gap between theoretical concepts and real-world applications. This comprehensive approach ensures you are well-prepared to handle any programming challenge with confidence.

6

Ebook

Amazon Redshift Cookbook. Recipes for building modern data warehousing solutions - Second Edition

Shruti Worlikar, Harshida Patel, Anusha Challa, Ippokratis Pandis

Amazon Redshift Cookbook offers comprehensive guidance for leveraging AWS's fully managed cloud data warehousing service. Whether you're building new data warehouse workloads or migrating traditional on-premises platforms to the cloud, this essential resource delivers proven implementation strategies. Written by AWS specialists, these easy-to-follow recipes will equip you with the knowledge to successfully implement Amazon Redshift-based data analytics solutions using established best practices. The book focuses on Redshift architecture, showing you how to perform database administration tasks on Redshift. You'll learn how to optimize your data warehouse to quickly execute complex analytic queries against very large datasets. The book covers recipes to help you take full advantage of Redshift's columnar architecture and managed services. You'll discover how to deploy fully automated and highly scalable extract, transform, and load (ETL) processes, helping you minimize the operational effort that you invest in managing regular ETL pipelines and ensuring the timely and accurate refreshing of your data warehouse. By the end of this Redshift book, you'll be able to implement a Redshift-based data analytics solution by adopting best-practice approaches for solving commonly faced problems.

7

Ebook

Analiza i prezentacja danych w Microsoft Excel. Vademecum Walkenbacha

John Walkenbach, Michael Alexander

Put Excel's capabilities to work in management! What are dashboards? How do you present the most important information effectively? How do you automate report creation? Excel is an indispensable tool when you have to process hundreds, thousands, or even millions of data points. On the market for many years, the program has become popular in virtually every setting, from academia to managers and executives, thanks to its intuitive user interface, enormous capabilities, and reasonable price. This book is intended for the latter group. The constant influx of new information in the business world makes it hard to take in. Dashboards come to the rescue! With this book you will learn how to build them and how to pick out the most important information from a sea of data. Along the way you will learn to analyze available data and present it in a useful form, use rapid presentation methods, automate reporting processes, and create eye-catching presentations. You will also become fluent in PivotTables and PivotCharts and in building advanced components for presenting trends or assessing performance against targets. It is an ideal book for any manager drowning in a thicket of data! Topics covered: definition of dashboards; determining user requirements; principles of dashboard design; designing the data model; charts in Microsoft Excel; using PivotTables; creating PivotCharts; sparklines; other data visualization techniques; building components for presenting trends and grouping data; presenting performance against targets; using macros in reports; adding interactive controls to a dashboard; importing data from Microsoft Access; methods for sharing data securely. Organize your most important information and present it effectively!

8

Ebook

Analiza i prezentacja danych w Microsoft Excel. Vademecum Walkenbacha. Wydanie II

John Walkenbach, Michael Alexander

Put Excel's capabilities to work in management! If you are facing hundreds, or perhaps thousands or millions, of data points from which you want to draw sound conclusions, you need a tool that helps you make sense of them. That tool is, of course, Excel. Whoever you are, whether student, accountant, manager, or executive, you will certainly appreciate its potential! With this book you will learn how to pick out the most important information from a sea of data. Along the way you will learn to prepare reports and presentations quickly. You will find that PivotTables need not be intimidating, and you will see the best techniques for presenting trends and assessing performance against targets. This new edition has been updated, improved, and expanded with plenty of new, useful material. You will learn how to import data from a SQL Server database and how to use the Power View add-in. This book is ideal for anyone drowning in a thicket of data! With this book you will: get to know Excel's tools for data analysis and presentation; master the best techniques for designing tables; prepare clear reports; make full use of Excel's capabilities. Rescue yourself from the sea of data!

9

Ebook

Analytics for the Internet of Things (IoT). Intelligent analytics for your intelligent devices

Andrew Minteer

We start with the perplexing task of extracting value from huge amounts of barely intelligible data. The data takes a convoluted route just to be on the servers for analysis, but insights can emerge through visualization and statistical modeling techniques. You will learn to extract value from IoT big data using multiple analytic techniques. Next we review how IoT devices generate data and how the information travels over networks. You’ll get to know strategies to collect and store the data to optimize the potential for analytics, and strategies to handle data quality concerns. Cloud resources are a great match for IoT analytics, so Amazon Web Services, Microsoft Azure, and PTC ThingWorx are reviewed in detail next. Geospatial analytics is then introduced as a way to leverage location information. Combining IoT data with environmental data is also discussed as a way to enhance predictive capability. We’ll also review the economics of IoT analytics and you’ll discover ways to optimize business value. By the end of the book, you’ll know how to handle scale for both data storage and analytics, how Apache Spark can be leveraged to handle scalability, and how R and Python can be used for analytic modeling.

10

Ebook

Apache Hadoop 3 Quick Start Guide. Learn about big data processing and analytics

Hrishikesh Vijay Karambelkar

Apache Hadoop is a widely used distributed data platform. It enables large datasets to be processed efficiently across a cluster of machines, rather than relying on one large computer to store and process the data. This book will get you started with the Hadoop ecosystem, and introduce you to the main technical topics, including MapReduce, YARN, and HDFS. The book begins with an overview of big data and Apache Hadoop. Then, you will set up a pseudo Hadoop development environment and a multi-node enterprise Hadoop cluster. You will see how parallel programming paradigms, such as MapReduce, can solve many complex data processing problems. The book also covers the important aspects of the big data software development lifecycle, including quality assurance and control, performance, administration, and monitoring. You will then learn about the Hadoop ecosystem, and tools such as Kafka, Sqoop, Flume, Pig, Hive, and HBase. Finally, you will look at advanced topics, including real-time streaming using Apache Storm, and data analytics using Apache Spark. By the end of the book, you will be well versed in the different configurations of a Hadoop 3 cluster.

11

Ebook

Apache Ignite Quick Start Guide. Distributed data caching and processing made easy

Sujoy Acharya

Apache Ignite is a distributed in-memory platform designed to scale and process large volumes of data. It can be integrated with microservices as well as monolithic systems, and can be used as a scalable, highly available and performant deployment platform for microservices. This book will teach you to use Apache Ignite for building a high-performance, scalable, highly available system architecture with data integrity. The book takes you through the basics of Apache Ignite and in-memory technologies. You will learn about installing and clustering Ignite nodes, caching topologies, and various caching strategies, such as cache aside, read and write through, and write behind. Next, you will delve into detailed aspects of Ignite’s data grid: web session clustering and querying data. You will learn how to process large volumes of data using compute grid and Ignite’s map-reduce and executor service. You will learn about the memory architecture of Apache Ignite and monitoring memory and caches. You will use Ignite for complex event processing, event streaming, and the time-series predictions of opportunities and threats. Additionally, you will go through off-heap and on-heap caching, swapping, and native and Spring framework integration with Apache Ignite. By the end of this book, you will be confident with all the features of Apache Ignite 2.x that can be used to build a high-performance system architecture.

12

Ebook

Apache Kafka 1.0 Cookbook. Over 100 practical recipes on using distributed enterprise messaging to handle real-time data

Raúl Estrada

Apache Kafka provides a unified, high-throughput, low-latency platform to handle real-time data feeds. This book will show you how to use Kafka efficiently, and contains practical solutions to the common problems that developers and administrators usually face while working with it. This practical guide contains easy-to-follow recipes to help you set up, configure, and use Apache Kafka in the best possible manner. You will use Apache Kafka Consumers and Producers to build effective real-time streaming applications. The book covers the recently released Kafka version 1.0, the Confluent Platform and Kafka Streams. The programming aspect covered in the book will teach you how to perform important tasks such as message validation, enrichment and composition. Recipes that focus on optimizing the performance of your Kafka cluster and integrating Kafka with a variety of third-party tools, such as Apache Hadoop, Apache Spark, and Elasticsearch, will greatly ease your day-to-day work with Kafka. Finally, we cover tasks related to monitoring and securing your Apache Kafka cluster using tools such as Ganglia and Graphite. If you're looking to become the go-to person in your organization when it comes to working with Apache Kafka, this book is the only resource you need to have.

13

Ebook

Apache Kafka Quick Start Guide. Leverage Apache Kafka 2.0 to simplify real-time data processing for distributed applications

Raúl Estrada

Apache Kafka is a great open source platform for handling your real-time data pipeline to ensure high-speed filtering and pattern matching on the fly. In this book, you will learn how to use Apache Kafka for efficient processing of distributed applications and will get familiar with solving everyday problems in fast data and processing pipelines. This book focuses on programming rather than the configuration management of Kafka clusters or DevOps. It starts off with installation and setting up the development environment, before quickly moving on to performing fundamental messaging operations such as validation and enrichment. Here you will learn about message composition with pure Kafka API and Kafka Streams. You will look into the transformation of messages in different formats, such as text, binary, XML, JSON, and AVRO. Next, you will learn how to expose the schemas contained in Kafka with the Schema Registry. You will then learn how to work with all relevant connectors with Kafka Connect. While working with Kafka Streams, you will perform various interesting operations on streams, such as windowing, joins, and aggregations. Finally, through KSQL, you will learn how to retrieve, insert, modify, and delete data streams, and how to manipulate watermarks and windows.

14

Ebook

Apache Spark 2: Data Processing and Real-Time Analytics. Master complex big data processing, stream analytics, and machine learning with Apache Spark

Romeo Kienzler, Md. Rezaul Karim, Sridhar Alla, Siamak Amirghodsi, ...

Apache Spark is an in-memory, cluster-based data processing system that provides a wide range of functionalities such as big data processing, analytics, machine learning, and more. With this Learning Path, you can take your knowledge of Apache Spark to the next level by learning how to expand Spark's functionality and build your own data flow and machine learning programs on this platform. You will work with the different modules in Apache Spark, such as interactive querying with Spark SQL, using DataFrames and datasets, implementing streaming analytics with Spark Streaming, and applying machine learning and deep learning techniques on Spark using MLlib and various external tools. By the end of this elaborately designed Learning Path, you will have all the knowledge you need to master Apache Spark, and build your own big data processing and analytics pipeline quickly and without any hassle. This Learning Path includes content from the following Packt products:
• Mastering Apache Spark 2.x by Romeo Kienzler
• Scala and Spark for Big Data Analytics by Md. Rezaul Karim, Sridhar Alla
• Apache Spark 2.x Machine Learning Cookbook by Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei

15

Ebook

Apache Spark 2.x Cookbook. Over 70 cloud-ready recipes for distributed Big Data processing and analytics

Rishi Yadav

While Apache Spark 1.x gained a lot of traction and adoption in the early years, Spark 2.x delivers notable improvements in the areas of API, schema awareness, performance, Structured Streaming, and simplifying building blocks to build better, faster, smarter, and more accessible big data applications. This book uncovers all these features in the form of structured recipes to analyze and mature large and complex sets of data. Starting with installing and configuring Apache Spark with various cluster managers, you will learn to set up development environments. Further on, you will be introduced to working with RDDs, DataFrames and Datasets to operate on schema-aware data, and real-time streaming with various sources such as Twitter Stream and Apache Kafka. You will also work through recipes on machine learning, including supervised learning, unsupervised learning, and recommendation engines in Spark. Last but not least, the final few chapters delve deeper into the concepts of graph processing using GraphX, securing your implementations, cluster optimization, and troubleshooting.

16

Ebook

Apache Spark 2.x for Java Developers. Explore big data at scale using Apache Spark 2.x Java APIs

Sourav Gulati, Sumit Kumar

Apache Spark is the buzzword in the big data industry right now, especially with the increasing need for real-time streaming and data processing. While Spark is built on Scala, the Spark Java API exposes all the Spark features available in the Scala version for Java developers. This book will show you how you can implement various functionalities of the Apache Spark framework in Java, without stepping out of your comfort zone. The book starts with an introduction to the Apache Spark 2.x ecosystem, followed by explaining how to install and configure Spark, and refreshes the Java concepts that will be useful to you when consuming Apache Spark's APIs. You will explore RDD and its associated common Action and Transformation Java APIs, set up a production-like clustered environment, and work with Spark SQL. Moving on, you will perform near-real-time processing with Spark streaming, Machine Learning analytics with Spark MLlib, and graph processing with GraphX, all using various Java packages. By the end of the book, you will have a solid foundation in implementing components in the Spark framework in Java to build fast, real-time applications.
