Analiza danych
Analiza danych jest ekscytującą dyscypliną, która umożliwia zrozumienie pewnych zjawisk, uzyskanie wglądu i wiedzy na podstawie surowych danych. Pojęcie to oznacza dokładnie przetwarzanie danych za pomocą technik matematycznych i statystycznych w celu uzyskania cennych wniosków, podjęcia ważnych decyzji i opracowania przydatnych produktów. Termin ten wywodzi się od angielskiego data science, często traktowanego jako synonim takich terminów, jak analityka biznesowa, badania operacyjne, business intelligence, wywiad konkurencyjny, analiza i modelowanie danych, a także pozyskiwanie wiedzy. Dzięki takim technologiom, jak języki Python czy R, platformy Hadoop i Spark masz szansę wyciągnąć maksimum wniosków, dostrzec szanse na rozwój swojej organizacji albo przewidzieć i zapobiec zagrożeniom.
Renato Baruti
Alteryx, as a leading data blending and advanced data analytics platform, has taken self-service data analytics to the next level. Companies worldwide often find themselves struggling to prepare and blend massive datasets that are time-consuming for analysts. Alteryx solves these problems with a repeatable workflow designed to quickly clean, prepare, blend, and join your data in a seamless manner. This book will set you on a self-service data analytics journey that will help you create efficient workflows using Alteryx, without any coding involved. It will empower you and your organization to take well-informed decisions with the help of deeper business insights from the data.Starting with the fundamentals of using Alteryx such as data preparation and blending, you will delve into the more advanced concepts such as performing predictive analytics. You will also learn how to use Alteryx’s features to share the insights gained with the relevant decision makers. To ensure consistency, we will be using data from the Healthcare domain throughout this book. The knowledge you gain from this book will guide you to solve real-life problems related to Business Intelligence confidently. Whether you are a novice with Alteryx or an experienced data analyst keen to explore Alteryx’s self-service analytics features, this book will be the perfect companion for you.
Learning Apache Apex. Real-time streaming applications with Apex
Ananth Gundabattula, Thomas Weise, Munagala V. Ramanath,...
Apache Apex is a next-generation stream processing framework designed to operate on data at large scale, with minimum latency, maximum reliability, and strict correctness guarantees.Half of the book consists of Apex applications, showing you key aspects of data processing pipelines such as connectors for sources and sinks, and common data transformations. The other half of the book is evenly split into explaining the Apex framework, and tuning, testing, and scaling Apex applications.Much of our economic world depends on growing streams of data, such as social media feeds, financial records, data from mobile devices, sensors and machines (the Internet of Things - IoT). The projects in the book show how to process such streams to gain valuable, timely, and actionable insights. Traditional use cases, such as ETL, that currently consume a significant chunk of data engineering resources are also covered.The final chapter shows you future possibilities emerging in the streaming space, and how Apache Apex can contribute to it.
Sandeep Yarabarla, Graham Doman
Cassandra is a distributed database that stands out thanks to its robust feature set and intuitive interface, while providing high availability and scalability of a distributed data store. This book will introduce you to the rich feature set offered by Cassandra, and empower you to create and manage a highly scalable, performant and fault-tolerant database layer.The book starts by explaining the new features implemented in Cassandra 3.x and get you set up with Cassandra. Then you’ll walk through data modeling in Cassandra and the rich feature set available to design a flexible schema. Next you’ll learn to create tables with composite partition keys, collections and user-defined types and get to know different methods to avoid denormalization of data. You will then proceed to create user-defined functions and aggregates in Cassandra. Then, you will set up a multi node cluster and see how the dynamics of Cassandra change with it. Finally, you will implement some application-level optimizations using a Java client.By the end of this book, you'll be fully equipped to build powerful, scalable Cassandra database layers for your applications.
Muhammad Asif Abbasi
Apache Spark has seen an unprecedented growth in terms of its adoption over the last few years, mainly because of its speed, diversity and real-time data processing capabilities. It has quickly become the preferred choice of tool for many Big Data professionals looking to find quick insights from large chunks of data. This book introduces you to the Apache Spark framework, and familiarizes you with all the latest features and capabilities introduced in Spark 2.Starting with a detailed introduction to Spark’s architecture and the installation procedure, this book covers everything you need to know about the Spark framework in the most practical manner. You will learn how to perform the basic ETL activities using Spark, and work with different components of Spark such as Spark SQL, as well as the Dataset and DataFrame APIs for manipulating your data. Then, you will perform machine learning using Spark MLlib, as well as perform streaming analytics and graph processing using the Spark Streaming and GraphX modules respectively. The book also gives special emphasis on deploying your Spark models, and how they can be operated in a clustered mode.During the course of the book, you will come across implementations of different real-world use-cases and examples, giving you the hands-on knowledge you need to use Apache Spark in the best possible manner.
Learning ArcGIS Runtime SDK for .NET. Build a GIS app Using ArcGIS Runtime SDK
Ron Vincent
ArcGIS is a geographic information system (GIS) that enables you to work with maps and geographic information. It can be used to create and utilize maps, compile geographic data, analyze mapped information, share and discover geographic information and manage geographic information in a database.This book starts by showing you where ArcGIS Runtime fits within Esri’s overall platform strategy. You'll create an initial map using the SDK, then use it to get an understanding of the MVVM model. You'll find out about the different kinds of layers and start adding layers, and you'll learn to transform maps into a 3D scene. The next chapters will help you comprehend and extract information contained in the maps using co-ordinates and layer objects. Towards the end, you will learn to set the symbology, decide whether to use 2D or 3D, see how to implement 2D or 3D, and learn to search and find objects. You'll also get to grips with many other standard features of the Application Programming Interface (API), including create applications and finally testing, licensing, and deploying them. Once completed, you will be able to meet most of the common requirements of any mapping application for desktop or mobile platforms.
Shiwang Kalkhanda
AWK is one of the most primitive and powerful utilities which exists in all Unix and Unix-like distributions. It is used as a command-line utility when performing a basic text-processing operation, and as programming language when dealing with complex text-processing and mining tasks. With this book, you will have the required expertise to practice advanced AWK programming in real-life examples. The book starts off with an introduction to AWK essentials. You will then be introduced to regular expressions, AWK variables and constants, arrays and AWK functions and more. The book then delves deeper into more complex tasks, such as printing formatted output in AWK, control flow statements, GNU's implementation of AWK covering the advanced features of GNU AWK, such as network communication, debugging, and inter-process communication in the GAWK programming language which is not easily possible with AWK. By the end of this book, the reader will have worked on the practical implementation of text processing and pattern matching using AWK to perform routine tasks.
Learning Couchbase. Design documents and implement real world e-commerce applications with Couchbase
Henry Potsangbam
This book achieves its goal by taking up an end-to-end development structure, right from understanding NOSQL document design to implementing full fledged eCommerce application design using Couchbase as a backend. Starting with the architecture of Couchbase to get you up and running, this book quickly takes you through designing a NoSQL document and implementing highly scalable applications using Java API. You will then be introduced to document design and get to know the various ways to administer Couchbase. Followed by this, learn to store documents using bucket. Moving on, you will then learn to store, retrieve and delete documents using smart client base on Java API. You will then retrieve documents using SQL like syntax call N1QL. Next, you will learn how to write map reduce base views. Finally, you will configure XDCR for disaster recovery and implement an eCommerce application using Couchbase.
Learning D3.js Mapping. Build stunning maps and visualizations using D3.js
Thomas Newton, Oscar Villarreal
If you are interested in creating maps for the web GIS data, this book is for you. Familiarity with D3.js will be helpful but is not necessary.