"What is needed is something that sits between where the data is stored and where the questions are asked. Both the data and the questions are infinite," he says.
Why can't a database vendor do that? "Each database vendor can only optimize its own stack, so you need a data catalog that plays the role of 'Switzerland', a neutral country."
"Furthermore, with the self-service trend represented by Tableau, the target market has to be users (not just IT), and the data stays where it is. This is extremely important for us."
Alation positions itself as a "trusted catalog for data" and offers data tagging based on machine learning. It creates a single source of reference for the organization across all of its data stores. Customers such as eBay and GoDaddy have built catalogs using this technology. eBay draws data from a Teradata data warehouse, and GoDaddy uses Tableau.
Alation illustrated the data governance problems its software addresses by quoting eBay's former chief data officer, Zoher Karu: "If just anyone pulls out some data, puts it into Excel, adds to it, writes it up in PowerPoint and carries it around here and there, that is the biggest crime in data governance."
Aaron Kalb, Alation's vice president of design, explains that the core idea behind the company's data catalog product is "collaborative filtering."
This principle of "data collection" is also at work in the case of another customer, Munich Re. According to Alation, Munich Re's chief data officer, Wolfgang Hauner, said: "The Munich Re data strategy is to provide customers with new and better risk-related services. The core of that strategy is an integrated self-service data analytics platform. As part of that platform, Alation's social catalog has already helped more than 600 users across the group to discover data easily and share knowledge with each other."
MapD, which works on data visualization, has its origins in the Arab Spring of 2010. While studying the use of Twitter at Harvard University during the revolutions sweeping the Middle East, founder and CEO Todd Mostak developed a prototype of the company's technology for exploring big datasets interactively.
After that, he became a research fellow at the Massachusetts Institute of Technology and concentrated on GPU database research. Thanks to their parallel processing architecture, GPUs can process images faster than CPUs. They are used for resource-intensive tasks such as computer games.
MapD has applied this technology to general-purpose analytics, in particular operational analytics, geospatial analysis and data science.
Its investors include In-Q-Tel, the venture fund of the US Central Intelligence Agency (CIA), GPU maker Nvidia, and Verizon. Among its customers, Volkswagen is said to use the software to visualize so-called "black box" AI and machine learning models, and Pactriglo, a geospatial property visualization organization in Los Angeles, to help tackle the city's housing crisis.
"GPUs are not for everyone," says Mostak. "Many computing problems are more sequential, and GPUs are not suited to unstructured data. But modern GPUs, with thousands of cores, are wonderful at parallelizing intensive work (on structured data). If you look at the trends in modern hardware, you can see great gains."
MapD also participates in the GPU Open Analytics Initiative, a project that aims to build a common data framework to accelerate data analytics on GPUs.
Aerospike is a NoSQL database vendor. Its roots lie in serving ad tech companies that bid for advertising space in real time, and it is now expanding further into financial services. The company's database is used for discovering new fraud patterns, identifying intraday financial risk, and online booking.
The company was founded in 2009, and 152 companies use its paid services. It also has close ties with Intel. The company positions itself as a specialist in unstructured data. It supports real-time transactions and analytics, and it can take on both NoSQL databases such as Couchbase, Cassandra and MongoDB, and Hadoop. The company explains that its database architecture achieves both speed and consistency by eliminating the cache.
According to Aerospike founder and CTO (chief technology officer) Brian Bulkowski, the company's hybrid flash storage and in-memory architecture has led to a dramatic reduction in server footprint. "We can reduce 450 Cassandra (database) nodes to 60. For CIOs and CTOs, it is a Copernican moment."
"The only way to get people to believe us is to have them run a proof of concept."
"The CIO of a major telecommunications carrier I dealt with recently was operating thousands of servers (for NoSQL databases). We can save $350,000 a year for every 50 nodes in that database," says Bulkowski.
Another Aerospike speaker explained that although similar techniques are used at Google and Facebook, the company's technology is not closed.
GridGain Systems founder and CTO Nikita Ivanov knows Aerospike well and acknowledges the speed of its database technology. He claims, however, that in similar configurations the GridGain database, which is based on Apache Ignite, performs even better because it is fully in-memory rather than flash-based.
CEO Abe Kleinfeld says that digital transformation as a process is accelerating the adoption of in-memory computing. The conventional bifurcated model of data warehouses and operational databases is not agile enough for such applications.
Kleinfeld points out that although GridGain is "like an open source SAP HANA", SAP HANA itself is "proprietary, high-end and expensive", and so will not be adopted by companies that have not invested heavily in SAP technology.
"The reason SAP HANA has such a huge number of customers is that (SAP builds HANA into its) applications. Customers do not use HANA for non-SAP applications. Most companies now take an open source-first approach to greenfield applications. That world favors our approach over the proprietary approaches of SAP, Oracle and Microsoft."
Kleinfeld cited financial services firms Barclays, Société Générale and ING, as well as Huawei, as recent customers. He explained that Ignite has reached about 1 million downloads at Apache and has become a project with a large number of commits. GridGain has about 100 customers for its paid services.
According to GridGain, the Russian bank Sberbank, an early customer of the company, has built the world's largest in-memory database cluster, comparable to those of Amazon Web Services and Alibaba. According to Kleinfeld, the company's revenue has almost doubled since 2017, and its headcount has increased by 80%.
Waterline Data was also revisited. The selling point of the company's data catalog technology is that it automatically attaches inferred business labels and supports data discovery.
This time, the team of founder and CEO Alex Gorelik enthusiastically presented a dashboard for the EU's General Data Protection Regulation (GDPR), which Waterline Data has been developing to surface silos of personal data stores within an organization.
A customer lamented, "What should I do?", but Gorelik believes that the regulation is good in terms of simply paying a fine.
Creditsafe, which the company cited as an example, uses this technology to identify GDPR-related personal data automatically. GlaxoSmithKline (GSK) was introduced as a customer that makes extensive use of Waterline Data's technology, although it does not use the GDPR system.
Mark Ramsey, GSK's chief data officer for research and development, explained that Waterline Data's product is used for the systematic analysis of large amounts of scientific data. The catalog software is said to be able to analyze distributed data dynamically across schemas and attributes.
Ramsey says that GSK structures its scientific data in accordance with pharmaceutical industry standards, but that this alone is not enough to make the data findable. That is where Waterline Data comes in.
"We are currently adding data to our data lake and building dashboards to reference that data. By expanding the use of Waterline Data in this environment, we gain self-service opportunities, and it becomes easier for scientists and researchers to do their work. Data discovery becomes easier, it becomes easier to understand where data is and how it is organized, and it becomes possible to access and analyze it directly. Right now we have a 'pre-service' mode in which we define what researchers access through guided analytics.
The data collection is truly opened up for scientists," Gorelik said.