Ref. Ares(2020)3255548 - 23/06/2020
Palantir Foundry
Managing business complexity and
accelerating value chain
transformation
Palantir Foundry is a platform for managing business complexity. It provides flexibility for both technical and
non-technical users to integrate, manage and interact with their data—and create lasting business value from
those inputs.
1 – Overview
We work closely with our partners to help them realize the value in their data. The most pressing questions
that organizations need to answer rely on many sources of data, but systems are rarely designed for making
connections with other systems.
When questions arise that must be answered with data from multiple source systems, a connection must be
built between each source system and each analytic system. If you’re trying to access only one database via
one application, this system is sufficient—but it doesn’t scale. When you try to add multiple analytical tools
and data sources, you create a behemoth of interconnecting systems. Things become even more complex if
you want to transform or pipeline your data.
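To make the scaling problem concrete, the sketch below counts the connections needed under two layouts. It assumes that in the point-to-point case every analytic system needs its own link to every source, and that with a shared data layer each system needs a single link. The function names are illustrative only and are not part of Foundry.

```python
def point_to_point_links(sources: int, tools: int) -> int:
    """Each analytic tool connects directly to each source system."""
    return sources * tools


def hub_links(sources: int, tools: int) -> int:
    """Each system connects once to a shared data layer."""
    return sources + tools


if __name__ == "__main__":
    # Compare both layouts as the number of systems grows.
    for sources, tools in [(1, 1), (5, 4), (20, 10)]:
        print(sources, tools,
              point_to_point_links(sources, tools),
              hub_links(sources, tools))
```

With one database and one application the direct link is the cheaper option, but at 20 sources and 10 tools the direct approach needs 200 links while the shared layer needs 30.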
When we looked at the systems that prevent our customers from using their data effectively, we realized that
we needed a way to manage data like we manage knowledge: by creating a single source of truth for people
throughout an organization to contribute to and consume. Drawing on our experience working with systems
that range from legacy COBOL mainframes to cutting-edge data lakes, we designed a data layer that makes
it simple to use data to drive business outcomes.
Palantir Foundry integrates and stores data, offers basic and advanced data transformation tools, and
connects directly to Palantir-built and third-party analytical tools. Palantir Foundry encodes our experience
into a comprehensive commercial software platform that simplifies, accelerates, and improves management
of information throughout the data lifecycle.
[Figure: diagram with the labels Data Hub, Reports, Data Lake, Datamarts, Enterprise Models, Ad Hoc Models, and Analytic Tools]
Copyright © 2018 Palantir Technologies Inc. All rights reserved. The information in this document is proprietary and
confidential, and contains certain trade secrets. Disclosure without the prior written approval of Palantir Technologies
Inc. is strictly prohibited. The content provided herein is provided for informational purposes only and shall not create
a warranty of any kind.
Revised 07/18.
2 – Technology Overview
We designed Palantir Foundry to support the core principles of data governance:
Create a single, flexible repository for all data: Data is stored in the rawest form possible to keep up
with evolving requirements for data management and analysis.
Audit and version data throughout the pipeline: Palantir Foundry retains the complete history of all data
that enters the system. Versioning and provenance ensure that you can always understand and reproduce
how a version of a dataset was generated.
Scale to accommodate constantly growing data environments: Palantir Foundry scales on commodity hardware
or commercial cloud services to accommodate increases in data size, user numbers, and data catalog size.
Interoperate within a landscape of systems and tools: With open interfaces and open data formats, Palantir
Foundry is designed to interoperate with Palantir-built and third-party technologies. Palantir Foundry
can receive, extract, route, and push data to and from any source in a modern data ecosystem.
Collaborate without sacrificing security or data integrity: Palantir Foundry provides tools that are
suitable for a range of user profiles: data scientists, developers, analysts, and operational users.
Built-in capabilities for versioning and data protection contribute to a high-integrity enterprise
knowledge base.
In this document, we describe how these capabilities interact to provide a complete information layer
for today’s data-driven enterprises.
3 – Flexible Data Management
Integrate and transform data on demand
We built Palantir Foundry using industry standard technologies for efficient storage, processing, and
transformation of massive-scale data. Foundry integrates common data sources—HDFS, JDBC and SQL
databases, flat files, etc.—out of the box and can be configured to support any other source system or
legacy technology. Data is stored in standard formats so other systems can easily access the raw,
persisted information. Data is captured without intermediate processing so you can:
— Decide on demand—based on the immediate analytical need—how to
access, model, and query each particular data source
— Explore new analytical questions and data sources as you iterate on
high-integrity enterprise models that other systems can use
We’ve discovered that data is most valuable when it isn’t locked into a strictly defined schema. By
encouraging the use of different models for different use cases, Palantir Foundry ensures that
organizations aren’t tied to a single data model that limits the kinds of questions that can be asked and
answered.
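The idea of keeping raw data and choosing a model at read time can be sketched in a few lines. This is an illustration under stated assumptions, not Foundry's implementation: the sample data, the `read_with_model` helper, and the two models are all hypothetical.

```python
import csv
import io

# Raw data is kept exactly as received (hypothetical sample).
RAW = "id,amount,date\n1,19.90,2018-07-01\n2,5.00,2018-07-02\n"


def read_with_model(raw: str, model: dict) -> list:
    """Apply a model (column name -> converter) when the data is read."""
    rows = csv.DictReader(io.StringIO(raw))
    return [{col: convert(row[col]) for col, convert in model.items()}
            for row in rows]


# Two models over the same raw text, each chosen for a different use case.
finance_model = {"id": int, "amount": float}
timeline_model = {"id": int, "date": str}

finance_rows = read_with_model(RAW, finance_model)
timeline_rows = read_with_model(RAW, timeline_model)
print(finance_rows)
print(timeline_rows)
```

Because the raw text is never rewritten, adding a third model later requires no change to what is stored.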
Work with large-scale data efficiently
Palantir Foundry can integrate data in near-real time by capturing differential updates as new versions
of a dataset. For example, to perform a routine extraction pull from a previously integrated data source
that has been updated with new information, Foundry records only the new information rather than
integrating the entire dataset again.
By integrating new information incrementally, Palantir Foundry enables low-latency updates, on the
order of seconds, so small changes to the underlying data sources propagate rapidly throughout the
system. Foundry also supports the use of distributed back-end stores and distributed computation
engines to make transformations efficient at scale.
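A minimal sketch of incremental integration follows. It assumes an append-only source whose rows carry a monotonically increasing `id`, which serves as a high-water mark. The `VersionedDataset` class is hypothetical and only illustrates the principle of recording new information as a new version.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedDataset:
    """Hypothetical dataset where every sync appends a new version."""
    versions: list = field(default_factory=list)  # each version holds only new rows
    last_seen_id: int = 0  # high-water mark of what has been integrated

    def sync(self, source_rows: list) -> int:
        """Record only rows newer than the high-water mark; return the count."""
        new_rows = [r for r in source_rows if r["id"] > self.last_seen_id]
        if new_rows:
            self.versions.append(new_rows)
            self.last_seen_id = max(r["id"] for r in new_rows)
        return len(new_rows)

    def current(self) -> list:
        """The current state is the concatenation of all recorded versions."""
        return [row for version in self.versions for row in version]


source = [{"id": 1, "v": "a"}, {"id": 2, "v": "b"}]
ds = VersionedDataset()
first = ds.sync(source)             # initial load records two rows
source.append({"id": 3, "v": "c"})  # the source system gains one row
second = ds.sync(source)            # only the new row is recorded
```

The second sync stores a single row as a new version rather than copying the whole source again, which is what keeps routine updates small.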
[Figure: Palantir Foundry connects new data incrementally to enable low-latency updates]
Update and manipulate datasets and schemas
Foundry CLI is a Java-based command-line tool for working with Foundry. It gives users a simpler way
to create and update data than general-purpose tools like cURL.
4 – Enterprise Auditing
We designed Palantir Foundry to promote analytical integrity and data accuracy. Foundry’s approach to
versioning and provenance ensures that you can always understand and reproduce how a particular
version of a dataset was generated.
Conduct high-integrity analysis through versioning and traceability
Palantir Foundry captures data in discrete, immutable updates that are aggregated to form the current
state of the data. By maintaining a record of the information used to produce the output, Foundry
ensures that results can always be reproduced. Users can review validation checks in an audit log,
where records are captured and logged with data.
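The relationship between immutable updates and reproducible state can be illustrated with a small replay function. This is a sketch under assumptions: the `Transaction` shape (upserts and deletes keyed by a row key) and the `state_at` helper are hypothetical and do not describe Foundry's internal format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """One immutable update: rows to upsert and keys to delete (hypothetical)."""
    upserts: tuple = ()  # tuple of (key, value) pairs
    deletes: tuple = ()  # tuple of keys


def state_at(log: tuple, version: int) -> dict:
    """Replay the first `version` transactions to rebuild the dataset state."""
    state = {}
    for txn in log[:version]:
        for key, value in txn.upserts:
            state[key] = value
        for key in txn.deletes:
            state.pop(key, None)
    return state


transaction_log = (
    Transaction(upserts=(("a", 1), ("b", 2))),
    Transaction(upserts=(("b", 20),)),
    Transaction(deletes=("a",)),
)

print(state_at(transaction_log, 2))  # state as of the second transaction
print(state_at(transaction_log, 3))  # current state
```

Since transactions are never modified after they are written, replaying the same prefix of the log always yields the same state, so any earlier version can be reproduced on demand.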
Understand data provenance
Palantir Foundry maintains a comprehensive history of all data, including an archive of all raw
information from every source system. Foundry’s versioning infrastructure stores data, the metadata
associated with the process of integration, and any subsequent transformations.
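One way to picture provenance metadata is as a mapping from each dataset version to the transformation and input versions that produced it. The sketch below is illustrative only: the dataset names, version labels, and file names in `LINEAGE` are invented, and `upstream` is a hypothetical helper.

```python
# Hypothetical lineage metadata: dataset version -> how it was produced.
LINEAGE = {
    "raw_orders@v3": {"transform": None, "inputs": []},
    "raw_customers@v1": {"transform": None, "inputs": []},
    "clean_orders@v2": {"transform": "clean_orders.py",
                        "inputs": ["raw_orders@v3"]},
    "report@v5": {"transform": "join_report.py",
                  "inputs": ["clean_orders@v2", "raw_customers@v1"]},
}


def upstream(version: str, lineage: dict) -> list:
    """Return every dataset version that contributed to `version`, nearest first."""
    seen = []
    queue = list(lineage[version]["inputs"])
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(lineage[current]["inputs"])
    return seen


print(upstream("report@v5", LINEAGE))
```

Walking the metadata this way answers the question "which raw inputs, at which versions, produced this result?" without touching the data itself.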
Manage datasets intuitively
Palantir Foundry offers a more intuitive alternative to the command line that lets you view and manage
datasets, transactions, and jobs in a web application.
Dataset and transaction management: View information about all datasets available in Foundry and their
associated transactions.
Job management: View information about past and in-progress jobs, including basic metadata, and drill
down on a particular job to see more detailed information, such as associated datasets and transactions.
Automate and track complex job orchestration
Palantir Foundry promotes data integrity by providing a framework for job orchestration. Using
Foundry’s management console, you can view details about the state of the system and jobs in
progress. Typical production pipelines might contain hundreds of data sources and many more
transformations. By automating job orchestration, Foundry eliminates the risk of error associated with
manually tracking which transformations need to be run in what order.
The job orchestration framework handles requests to run transformations by assigning the necessary jobs to a job
execution service. It generates the necessary transactions, executes the appropriate jobs, logs any
output, and updates the transaction state based on the results.
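Working out which transformations must run in what order is a topological sort over the dependency graph. The sketch below uses Python's standard `graphlib` module to show the idea. The pipeline in `DEPENDENCIES`, the `run_pipeline` function, and the state names are assumptions made for illustration, not Foundry's orchestration API.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each dataset maps to the datasets it is built from.
DEPENDENCIES = {
    "clean_orders": {"raw_orders"},
    "clean_customers": {"raw_customers"},
    "joined": {"clean_orders", "clean_customers"},
    "report": {"joined"},
}


def run_pipeline(dependencies: dict, run_job) -> dict:
    """Run jobs in dependency order and record a state for each one."""
    states = {}
    for dataset in TopologicalSorter(dependencies).static_order():
        inputs = dependencies.get(dataset, set())
        if any(states.get(i) != "SUCCEEDED" for i in inputs):
            states[dataset] = "SKIPPED"  # an upstream job did not succeed
            continue
        try:
            run_job(dataset)
            states[dataset] = "SUCCEEDED"
        except Exception:
            states[dataset] = "FAILED"
    return states


order = []
states = run_pipeline(DEPENDENCIES, order.append)
print(order)
print(states)
```

Each job runs only after all of its inputs have succeeded, and a failure upstream marks dependent jobs as skipped instead of letting them run on stale or missing data.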
5 – Scalability and Interoperability
We built Palantir Foundry to grow with your enterprise by targeting the issues that make systems
difficult to resize. When capacity must be defined up front, systems either quickly become too small or
incur costs for unused capacity.
To enable efficient and elastic scaling, Foundry:
— Scales horizontally across commodity servers, whether using
on-premise hardware or commercial cloud infrastructure
— Uses computing resources efficiently, with Distributed File Systems
(e.g., Hadoop, Amazon S3) as a backing store to improve query
efficiency and spread operations over multiple systems
— Maximizes performance and storage density
Manage data flexibly with an open system
Many data management systems lock you into a solution by storing data in proprietary formats or
closing off systems with proprietary APIs. Palantir Foundry stores data in open formats and exposes
open APIs to facilitate interoperability.
Palantir Foundry exposes data to external services via several interfaces. You can easily export
structured data from Foundry to CSVs and databases via direct query or a web API.
Foundry provides several “push” mechanisms, including file transfer and common standard
connections (e.g., JDBC/ODBC drivers that enable applications to interact with databases). External
systems can also “pull” from Foundry via methods like common drivers and RESTful Web Service APIs
and by connecting directly to the Distributed File System where data lives. As an integrated ecosystem,
Foundry includes the base data management layer, an authoring environment for data transformations,
a suite of user-facing analytical applications, and developer frameworks & open APIs for building
operational applications.
6 – Security and Collaboration
Palantir Foundry’s granular access control framework lets you secure information at the dataset level
and assign specific degrees of access for different user groups. For each individual dataset, you can
define the users who are permitted to discover, read, modify, and delete the data.
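Dataset-level access control with distinct degrees of access can be sketched as a lookup from dataset and group to a set of permitted operations. The four operations come from the text above. The `GRANTS` table, group names, and `is_allowed` function are hypothetical and are not Foundry's permission model.

```python
from enum import Flag, auto


class Access(Flag):
    """The four degrees of access named in the text."""
    NONE = 0
    DISCOVER = auto()
    READ = auto()
    MODIFY = auto()
    DELETE = auto()


# Hypothetical grants: dataset -> group -> allowed operations.
GRANTS = {
    "sales": {
        "analysts": Access.DISCOVER | Access.READ,
        "engineers": Access.DISCOVER | Access.READ | Access.MODIFY | Access.DELETE,
    },
    "payroll": {"hr": Access.DISCOVER | Access.READ},
}


def is_allowed(groups: set, dataset: str, needed: Access) -> bool:
    """True if any of the user's groups holds the needed access on the dataset."""
    dataset_grants = GRANTS.get(dataset, {})
    return any(needed in dataset_grants.get(g, Access.NONE) for g in groups)


print(is_allowed({"analysts"}, "sales", Access.READ))
print(is_allowed({"analysts"}, "payroll", Access.DISCOVER))
```

Keeping discovery separate from read access means a group with no grant on a dataset cannot even learn that the dataset exists.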
Palantir Foundry maintains an audit trail that captures all user activity within the platform. For every
user action—read, write, deletion—Foundry captures what data was accessed, where, when, and by
whom. Foundry also captures a detailed history of integration, including time of connection, source, and
revision history. This metadata is used to track data provenance and manage compliance with data
auditing and retention policies.
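An audit trail of this kind can be pictured as an append-only list of records that each capture what, where, when, and by whom. The sketch below is a minimal in-memory illustration. The `AuditEvent` fields and the `AuditLog` class are assumptions, not Foundry's audit schema.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """One audit record: who did what to which data, where, and when."""
    user: str
    action: str    # "read", "write", or "delete"
    dataset: str
    origin: str    # where the request came from, e.g. a host name
    timestamp: datetime


class AuditLog:
    """Append-only list of audit events (hypothetical, in-memory)."""

    def __init__(self):
        self._events = []

    def record(self, user, action, dataset, origin):
        """Append one event stamped with the current UTC time."""
        event = AuditEvent(user, action, dataset, origin,
                           datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def by_dataset(self, dataset):
        """All events touching one dataset, in the order they were recorded."""
        return [e for e in self._events if e.dataset == dataset]


audit_log = AuditLog()
audit_log.record("alice", "read", "sales", "host-a")
audit_log.record("bob", "write", "sales", "host-b")
audit_log.record("alice", "delete", "payroll", "host-a")
```

Filtering the log by dataset, as `by_dataset` does, is the kind of query a compliance review would start from.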
To protect data at rest, Foundry leverages the encryption mechanism provided by the storage back end
(e.g., full disk encryption, Kerberos). Alternatively, you can configure Foundry to store data in an
encrypted format. In transit, data is encrypted via SSL/TLS during both client-to-server and
server-to-server communications.
Palantir Foundry’s security model and versioning capabilities contribute to a collaboration framework
that breaks down the barriers that prevent cross-organizational information sharing. Foundry lets you
define types or groups of users who can collaborate (e.g., data scientists, system administrators,
and business analysts). The security model ensures that data is only exposed to users with the right
permissions.
Palantir Foundry’s role-based security model enables seamless
cross-functional collaboration among different user groups