Data Science |
|
A decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains conditional control statements. |
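The idea of a tree made only of conditional branches, with chance outcomes, costs, and utilities, can be sketched in a few lines. This is a minimal illustration, not the method of any particular tool; the node layout (`kind`, `branches`, `options`) and the example numbers are hypothetical.

```python
def expected_value(node):
    """Return the expected utility of a decision-tree node.

    Hypothetical node layout used for this sketch:
      {"kind": "outcome", "utility": number}
      {"kind": "chance", "branches": [(probability, child), ...]}
      {"kind": "decision", "options": [(cost, child), ...]}
    """
    kind = node["kind"]
    if kind == "outcome":
        # Leaf: the utility of this end state.
        return node["utility"]
    if kind == "chance":
        # Chance event: probability-weighted average of the children.
        return sum(p * expected_value(child) for p, child in node["branches"])
    # Decision: choose the option with the best utility after its cost.
    return max(expected_value(child) - cost for cost, child in node["options"])


# Illustrative tree: launch a product (cost 20) or skip it (cost 0).
tree = {
    "kind": "decision",
    "options": [
        (20, {"kind": "chance", "branches": [
            (0.5, {"kind": "outcome", "utility": 100}),
            (0.5, {"kind": "outcome", "utility": 10}),
        ]}),
        (0, {"kind": "outcome", "utility": 30}),
    ],
}

best = expected_value(tree)
```

With these made-up numbers, launching is worth 0.5 × 100 + 0.5 × 10 − 20 = 35, which beats the 30 from skipping, so `best` is 35.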
|
A branch of machine learning based on a specific set of algorithms. These algorithms are called artificial neural networks and were designed to mimic a human brain’s structure and function. The algorithms can learn different levels of representation (abstraction) through classification and pattern analysis, among other methods. |
|
erwin Data Modeler is a data modeling tool used to find, visualize, design, deploy and standardize high-quality enterprise data assets. |
|
esProc SPL (Structured Procedure Language) is a high-performance scripting language designed for data computation and analysis. It provides a set of powerful functions and capabilities for manipulating structured data, making it well-suited for tasks such as data cleaning, transformation, and analytics. |
|
Federated Learning (FL) is a machine learning approach that trains a model across multiple decentralized devices or servers, each holding its own local data samples, without sending that data to a central location. Only model updates are shared, which helps preserve privacy, so the technique is often used where data privacy and security are critical, such as in healthcare and on mobile devices. |
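The training loop described above can be sketched with a toy one-parameter model. This is a simplified illustration of federated averaging, assuming a linear model `y = w * x` and plain gradient descent on each client; the function names, learning rate, and data are hypothetical, and real systems add secure aggregation, client sampling, and more.

```python
def local_update(w, data, lr=0.1, epochs=5):
    """Run a few gradient steps on one client's private data."""
    for _ in range(epochs):
        # Gradient of the mean squared error for the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(client_weights, client_sizes):
    """Combine client models, weighting each by its number of samples."""
    total = sum(client_sizes)
    return sum(w * n for w, n in zip(client_weights, client_sizes)) / total


def train(clients, rounds=20):
    """Server loop: broadcast the model, collect updates, average them."""
    w = 0.0
    for _ in range(rounds):
        # Each client trains locally; only the weight leaves the client.
        local = [local_update(w, data) for data in clients]
        w = federated_average(local, [len(data) for data in clients])
    return w


# Two clients whose private data follow y = 3x (made-up values).
clients = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5), (1.5, 4.5), (1.0, 3.0)],
]
w = train(clients)
```

The server never sees the `(x, y)` pairs, only the locally trained weights, and the averaged weight should approach 3 on this toy data.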
|
GAN, short for Generative Adversarial Network, is a type of artificial neural network framework used in machine learning and generative modeling. It consists of two neural networks, a generator and a discriminator, which are trained together through a competitive process: the generator creates data and the discriminator evaluates it. GANs are often used for tasks like image generation. |
|
An open-source library for unsupervised topic modeling and natural language processing, using modern statistical machine learning. |
|
A data visualization package for the statistical programming language R. ggplot2 can serve as a replacement for the base graphics in R and contains a number of defaults for web and print display of common scales. |
|
An open-source deep learning interface that allows developers to build machine learning models. |
|
GPT, or Generative Pre-trained Transformer, is a type of language model based on the transformer architecture and used for natural language processing. It is designed to generate human-like text and has found applications in chatbots, language translation, content generation, and more. |
|
Graph technologies refer to a set of tools, methods, and frameworks for managing and analyzing data in graph structures, which consist of nodes and edges. They are particularly useful for modeling and exploring complex relationships in various domains, from social networks to recommendation systems. |
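A graph of nodes and edges can be sketched as an adjacency list. The example below is a minimal illustration of the recommendation use case mentioned above, assuming an undirected friendship graph; the function names and the sample data are hypothetical and do not reflect any specific graph database API.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected graph as a mapping from node to neighbor set."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def recommend(graph, user):
    """Suggest friends of friends who are not already connected to user."""
    direct = graph[user]
    candidates = set()
    for friend in direct:
        candidates |= graph[friend]
    # Exclude existing connections and the user themselves.
    return sorted(candidates - direct - {user})


# Made-up social network: each pair is an edge between two nodes.
edges = [
    ("ana", "bo"), ("bo", "cy"), ("bo", "di"),
    ("ana", "di"), ("di", "eve"),
]
graph = build_graph(edges)
suggestions = recommend(graph, "ana")
```

Here `ana` is connected to `bo` and `di`, so the suggestions are the people two hops away, `cy` and `eve`.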
|
An open-source Apache software framework used for distributed storage and processing of big data sets across clusters of computers using simple programming models. |
|
A distributed, versioned, non-relational database modeled after Google's Bigtable. It is built on top of HDFS and supports real-time read/write operations on large datasets stored as key/value data. HBase is written in Java. Today it is an Apache Software Foundation project and an integral part of the Hadoop ecosystem. |
|
Hadoop Distributed File System, HDFS for short, is a Java-based distributed file system that reliably stores large data sets (files in the range of terabytes to petabytes). HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. It is the primary storage used by Hadoop applications. |
|
Hortonworks Data Platform is a secure, enterprise-ready open source Hadoop distribution based on a centralized architecture (YARN). HDP enables enterprises to deploy, integrate and work with unprecedented volumes of structured and unstructured data. |
|
Tableau's in-memory data engine technology, designed for fast data ingest and analytical query processing on large or complex data sets. |
|
The application of a set of techniques and algorithms to a digital image to analyze, enhance, or optimize image characteristics such as sharpness and contrast. |
|
A modern, massively distributed SQL query engine for Apache Hadoop. It allows you to analyze, transform and combine data from a variety of data sources. With Impala, you can query data, whether stored in HDFS or HBase, in real time. |
|
A functional language designed for processing data and JSON queries on big data. It is suitable for any volume of data, both structured and unstructured. Jaql also works on other data formats, such as XML and CSV, and it is compatible with SQL structured data. |
|
Julia is a high-level, high-performance programming language designed for numerical and scientific computing, and widely used for data science and machine learning. It is known for its speed and productivity, making it a popular choice for data scientists and researchers. |
|
A neural networks API. It can run on top of TensorFlow, CNTK, or Theano. This library allows easy and fast prototyping, supports both convolutional and recurrent networks, and runs seamlessly on CPU and GPU. |
|
An Amazon Web Services offering for processing big data in real time. It enables users to get timely information and react to it quickly, and it simplifies writing applications that rely on data that must be processed in real time. |
|
KNIME is an open-source data analytics, reporting, and integration platform. It allows users to perform data analysis, manipulation, and transformation through a visual interface, making it a popular choice for data scientists and analysts who want to build and execute machine learning workflows. |
|
A streaming SQL engine for Apache Kafka. It provides an interactive SQL interface for stream processing on Kafka and supports a wide range of streaming operations, including data filtering, transformations, aggregations, joins, windowing, sessionization, and more. |
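Windowed aggregation, one of the streaming operations listed above, can be sketched in plain Python. This is a conceptual illustration of a tumbling (fixed-size, non-overlapping) window count, assuming events arrive as `(timestamp_in_seconds, key)` pairs; it is not the engine's SQL syntax or API, and the function name and data are hypothetical.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Count events per key within fixed, non-overlapping time windows."""
    counts = Counter()
    for timestamp, key in events:
        # Align the timestamp to the start of its window.
        window_start = (timestamp // window_seconds) * window_seconds
        counts[(window_start, key)] += 1
    return dict(counts)


# Made-up click stream: (timestamp in seconds, page key).
events = [(0, "a"), (3, "a"), (9, "b"), (10, "a"), (14, "b"), (19, "b")]
counts = tumbling_window_counts(events, window_seconds=10)
```

With a 10-second window, the events fall into two windows starting at 0 and 10, and each `(window_start, key)` pair gets its own count.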
|
A distributed analytics engine that provides a SQL interface and multi-dimensional analysis (OLAP) on Hadoop supporting extremely large datasets. |