Every time you take a photo, save a document, or download a video, your device is secretly writing a diary entry alongside the actual content. You see the image of your vacation or the text of your report, but buried deep inside those files are dozens, sometimes hundreds, of invisible tags. These metadata fields are data about data: they describe the origin, creation, and technical specifications of a digital file. They include everything from the exact GPS coordinates where a photo was taken to the total editing time spent on a spreadsheet.
You might think that deleting the visible content is enough to protect your privacy, but these hidden fields often tell a much more revealing story. A simple JPEG can expose your home address through its GPS tags, the model of your camera, and sometimes even the serial number of the device. A PDF can reveal who created it, what software they used, and when it was last modified. The problem isn't just one type of tag; it's the sheer volume of different schemas at work. There is no single "complete list" because different file types use different languages to store this information.
Why There Is No Single Universal List
If you search for a definitive catalog of every metadata field possible, you will hit a wall. That is because metadata is not standardized across all of computing. Instead, it is fragmented into various ecosystems, each with its own rules and field names. A field called "Creator" in one system might be labeled "Author" in another, or simply ignored entirely by a third.
To understand what is hiding in your files, you have to look at the major layers of metadata storage:
- File System Metadata: Information added by your operating system (Windows, macOS, Linux) regarding file size, permissions, and timestamps.
- Embedded Standards: Protocols like EXIF, IPTC, and XMP that live inside the file structure itself.
- Application-Specific Properties: Data added by specific software, such as Microsoft Office or Adobe Photoshop.
- Custom Fields: User-defined tags added by enterprise systems or digital asset managers.
This fragmentation means that a tool capable of removing metadata must be able to read multiple distinct formats simultaneously. If a cleaner only strips EXIF data from an image but leaves the XMP stream intact, your location data could still be exposed.
The Core Descriptive Layer: Dublin Core
One of the most widely recognized frameworks for describing resources is Dublin Core, a set of cross-domain metadata standards used to describe digital objects. While originally designed for libraries and archives, many modern applications borrow from its logic. The standard defines 15 core elements, though extended profiles often add more. Here are ten of the core fields you might encounter under this umbrella:
- Title: The name given to the resource.
- Creator: The entity primarily responsible for making the content.
- Subject: Keywords or topics describing the content.
- Description: An abstract or summary of the resource.
- Publisher: The entity responsible for making the resource available.
- Date: Key dates associated with the resource, such as creation or release.
- Type: The nature or genre of the content (e.g., image, dataset).
- Format: The physical or digital manifestation, such as JPEG or PDF.
- Identifier: A unique reference, like a DOI or internal ID.
- Rights: Information about rights held in and over the resource.
In academic or research contexts, additional fields like Funders, Checksums (to verify data integrity), and Provenance (history of ownership) are often appended. Even if you are just sharing a casual photo, your camera app may be mapping its internal data to some of these concepts, labeling you as the "Creator" and embedding the "Date" of capture.
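To make the field list above concrete, here is a minimal sketch of how a handful of Dublin Core elements could be serialized as XML. The `dc` namespace URI is the standard one for the 15 core elements; the `metadata` wrapper element, the `build_dc_record` helper, and the sample values are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Namespace URI for the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_dc_record(fields: dict) -> str:
    """Serialize a flat dict of Dublin Core elements to an XML string."""
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("metadata")  # hypothetical wrapper element
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Illustrative values only; a camera app might map its own data like this.
record = build_dc_record({
    "title": "Beach at sunset",
    "creator": "Jane Example",
    "date": "2026-06-03",
    "format": "image/jpeg",
})
print(record)
```

Real applications usually embed these elements inside a larger container (XMP, for instance) rather than in a standalone wrapper like this one.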
Image Metadata: EXIF, IPTC, and XMP
Images are notorious for carrying heavy loads of personal data. When you snap a picture with a smartphone or DSLR, the file usually contains three parallel sets of metadata:
- EXIF (Exchangeable Image File Format): Primarily technical data. This includes the camera make and model, lens serial number, shutter speed, aperture, ISO, flash usage, and, crucially, GPS coordinates: latitude and longitude values embedded in the image file that reveal the precise location where the photo was taken.
- IPTC (International Press Telecommunications Council): Focuses on descriptive and copyright information. Fields here include captions, keywords, author name, contact details, and model release information.
- XMP (Extensible Metadata Platform): A flexible container developed by Adobe that can hold both EXIF and IPTC data, plus custom tags. It is often used to store editing history and software versions.
Consider a photo uploaded to a public forum. The EXIF data might reveal that the photo was taken with an iPhone 15 Pro Max at 8:42 PM on June 3, 2026. The GPS tags pinpoint the location to within a few meters. The XMP data might show that the image was edited in Lightroom Classic version 13.1. Without stripping these fields, anyone downloading the image can extract this entire timeline and location profile.
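The three parallel sets are easy to see at the byte level. The sketch below walks the segments at the start of a JPEG and reports which metadata blocks are present, using the identifier strings that conventionally open each one (`Exif` in APP1, the Adobe XMP namespace in APP1, and `Photoshop 3.0` in APP13 for IPTC). It is a simplified scanner, not a full parser: it assumes a well-formed file, ignores extended XMP, and the function name and synthetic sample are illustrative.

```python
import struct

EXIF_ID = b"Exif\x00\x00"
XMP_ID = b"http://ns.adobe.com/xap/1.0/\x00"
IPTC_ID = b"Photoshop 3.0\x00"


def list_metadata_segments(data: bytes) -> list:
    """Return (kind, segment_length) for each EXIF/XMP/IPTC segment found."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG (missing SOI marker)")
    found = []
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker == 0xDA:  # start of scan: compressed image data follows
            break
        # The 2-byte length counts itself but not the 2-byte marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(EXIF_ID):
            found.append(("EXIF", length))
        elif marker == 0xE1 and payload.startswith(XMP_ID):
            found.append(("XMP", length))
        elif marker == 0xED and payload.startswith(IPTC_ID):
            found.append(("IPTC", length))
        pos += 2 + length
    return found


def _segment(marker: int, payload: bytes) -> bytes:
    """Build one JPEG segment for the synthetic sample below."""
    return bytes([0xFF, marker]) + struct.pack(">H", len(payload) + 2) + payload


# Synthetic stand-in for a real photo: EXIF and XMP segments, then scan data.
sample = (
    b"\xff\xd8"
    + _segment(0xE1, EXIF_ID + b"tiff-data")
    + _segment(0xE1, XMP_ID + b"<x:xmpmeta/>")
    + b"\xff\xda\x00\x02\xff\xd9"
)
kinds = [kind for kind, _ in list_metadata_segments(sample)]
print(kinds)
```

With a real file you would pass `open(path, "rb").read()` instead of the synthetic sample. Decoding the individual tags (GPS, timestamps, software versions) takes a proper EXIF or XMP parser on top of this.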
Document Metadata: Office and PDF Structures
Documents are equally guilty of leaking information, but the structure is different. Modern Office files (DOCX, XLSX, PPTX) and OpenDocument files (ODT, ODS) are actually ZIP archives containing XML files. Inside these archives, metadata lives in specific property files:
- Core Properties: Author, title, subject, category, comments, keywords, and revision number.
- Application Properties: Template name, total editing time, page count, and word count.
- Custom Properties: Any additional fields added by templates or macros.
PDFs are even more complex. A PDF can contain two separate metadata stores: the older Info Dictionary and the newer XMP Stream. The Info Dictionary holds basic fields like Title, Author, Creator, Producer, CreationDate, and ModDate. The XMP stream can run much longer and carry detailed data, including software version history and color profiles. Many naive metadata cleaners only wipe the Info Dictionary, leaving the XMP stream, and all its secrets, intact.
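A rough way to check whether a PDF carries both stores is to look for their telltale byte patterns. The sketch below is a heuristic, not a parser: it assumes the trailer (or cross-reference stream dictionary) and the XMP packet are stored uncompressed and unencrypted, which is common but not guaranteed. The function name and sample bytes are illustrative.

```python
import re


def pdf_metadata_stores(data: bytes) -> dict:
    """Heuristically report which of the two PDF metadata stores appear present."""
    return {
        # An indirect reference such as "/Info 2 0 R" points at the Info Dictionary.
        "info_dictionary": re.search(rb"/Info\s+\d+\s+\d+\s+R", data) is not None,
        # XMP packets are wrapped in an <?xpacket ...?> header and an x:xmpmeta element.
        "xmp_stream": b"<x:xmpmeta" in data or b"<?xpacket begin" in data,
    }


# Minimal synthetic fragment: a trailer that references an Info Dictionary only.
sample = b"%PDF-1.4\ntrailer\n<< /Root 1 0 R /Info 2 0 R >>\n%%EOF"
stores = pdf_metadata_stores(sample)
print(stores)
```

If a cleaner reports success but a check like this still finds an XMP packet afterward, that is a sign only the Info Dictionary was wiped. A negative result proves less, since compressed or encrypted metadata would go undetected.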
For example, a resume saved as a DOCX might reveal that you spent 4 hours editing it (Total Editing Time) and that it was created using a template from a specific university career center (Template). A legal contract in PDF format might retain the name of the law firm that drafted it in the Author field, and the software that generated it in the Producer field, even after you remove the visible text.
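Because Office files are ZIP archives, the standard library is enough to peek at these properties. The sketch below reads the two property parts that Office Open XML files normally carry, `docProps/core.xml` and `docProps/app.xml`, and flattens their top-level elements into dictionaries. It skips custom properties, which use a nested layout, and OpenDocument files, which keep their metadata in `meta.xml` instead. The helper name and the in-memory sample are assumptions for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

PROPERTY_PARTS = ("docProps/core.xml", "docProps/app.xml")


def read_office_properties(source) -> dict:
    """Return {part name: {element name: text}} for each property part present."""
    result = {}
    with zipfile.ZipFile(source) as archive:
        names = set(archive.namelist())
        for part in PROPERTY_PARTS:
            if part not in names:
                continue
            root = ET.fromstring(archive.read(part))
            fields = {}
            for element in root:
                # Drop the "{namespace}" prefix ElementTree puts on tag names.
                local_name = element.tag.rsplit("}", 1)[-1]
                if element.text and element.text.strip():
                    fields[local_name] = element.text.strip()
            result[part] = fields
    return result


# Build a tiny stand-in archive in memory; a real call would pass a .docx path.
CORE_XML = (
    '<cp:coreProperties '
    'xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:creator>Jane Example</dc:creator></cp:coreProperties>"
)
APP_XML = (
    '<Properties '
    'xmlns="http://schemas.openxmlformats.org/officeDocument/2006/extended-properties">'
    "<Template>CareerCenter.dotx</Template><TotalTime>240</TotalTime></Properties>"
)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("docProps/core.xml", CORE_XML)
    archive.writestr("docProps/app.xml", APP_XML)
buffer.seek(0)

properties = read_office_properties(buffer)
print(properties)
```

Note that `TotalTime` is stored in minutes, so the four hours of editing in the resume example would show up as 240.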
Operating System and Shell Properties
Before a file even reaches an application, your operating system assigns it a layer of metadata. In Windows, for instance, File Explorer uses a shell property handler to display attributes. A PowerShell script can enumerate up to 295 distinct column indices for a single file type. Common OS-level fields include:
- Name: The filename.
- Date Created / Modified / Accessed: Timestamps managed by the filesystem.
- Size: File size in bytes.
- Owner: The user account that owns the file.
- Attributes: Read-only, hidden, or system flags.
On macOS, the Finder adds extended attributes and Spotlight comments. These OS-level tags are often overlooked because they don't travel with the file in the same way embedded metadata does. However, if you share a file via a network drive or cloud sync service, these attributes can sometimes persist or be replicated, potentially revealing who accessed the file and when.
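The filesystem layer can be inspected without any special tooling. Here is a minimal, cross-platform sketch using `os.stat`; it deliberately leaves out creation time and owner, because how those are exposed differs between Windows, macOS, and Linux. The function name and the temporary sample file are illustrative.

```python
import os
import stat
import tempfile
from datetime import datetime, timezone


def filesystem_metadata(path: str) -> dict:
    """Collect a few OS-level fields that live outside the file's contents."""
    info = os.stat(path)
    return {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": datetime.fromtimestamp(info.st_mtime, tz=timezone.utc).isoformat(),
        "accessed": datetime.fromtimestamp(info.st_atime, tz=timezone.utc).isoformat(),
        # True when the owner's write permission bit is cleared.
        "read_only": not (info.st_mode & stat.S_IWUSR),
    }


# Create a throwaway file so the example runs anywhere.
with tempfile.TemporaryDirectory() as folder:
    sample_path = os.path.join(folder, "report.txt")
    with open(sample_path, "wb") as handle:
        handle.write(b"hello")
    meta = filesystem_metadata(sample_path)
print(meta)
```

None of these values are stored inside the file itself, which is why copying a file to a different filesystem or emailing it typically resets them.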
Specialized and Scientific Metadata
In professional or scientific workflows, metadata becomes highly specialized. Geospatial imagery, for example, includes fields like Band Names, Cloud Cover, Sun Azimuth, and Solar Irradiance. These fields are critical for analyzing satellite data but irrelevant to a casual user. Similarly, video files (MP4, MOV) store metadata in atoms such as moov (the movie container) and udta (user data, usually nested inside moov). These can identify the recording device and editing software, and even carry lyrics or artist credits for music videos.
Enterprise document management systems and e-discovery platforms add yet another layer. Tools like Everlaw or Folderit allow users to define custom metadata fields for tracking purposes. A document might carry tags like Admin Rating, Bates Number, Redaction Stamp Details, or Viewed By. These fields are not part of the original file but are injected during processing, creating a rich audit trail that can be just as sensitive as the content itself.
How to See What Is Hidden in Your Files
You cannot remove what you cannot see. Most users have no idea how much data their files carry until they inspect them. Here is how you can check:
- Windows: Right-click a file, select Properties, and go to the Details tab. This shows a subset of the metadata, but not everything. For a deeper dive, developers use PowerShell scripts to query the Shell.Application COM object.
- macOS: Right-click a file, choose Get Info. Look under the More Info section for EXIF data in images or creator info in documents.
- Online Inspectors: Various web tools allow you to upload a file to view its raw metadata. However, uploading files to unknown servers poses a privacy risk in itself.
A safer approach is to use a browser-based tool that processes files locally. For instance, Vaulternal's Metadata Remover offers an inspector mode that lets you view all hidden fields before deciding to strip them. Because it runs entirely in your browser using WebAssembly, the file never leaves your device, eliminating the risk of server-side leaks.
Stripping Metadata Safely and Completely
Once you know what is there, the next step is removal. Simply renaming the file or converting it to a different format (like saving a Word doc as a PDF) often fails to clear all metadata. The new format may inherit old tags or generate new ones.
Effective metadata removal requires parsing the file structure and rewriting it without the hidden blocks. For images, this means stripping EXIF/IPTC/XMP headers while preserving pixel data. For PDFs, it involves clearing both the Info Dictionary and the XMP stream. For Office files, it means editing the underlying XML properties.
When choosing a tool, prioritize client-side processing. Server-based removers require you to upload your sensitive files, which defeats the purpose of protecting your privacy. Look for tools that offer a "view and remove" dual mode, allowing you to audit the data first. Additionally, features like JSON export of removed fields can provide an audit trail for compliance purposes, proving exactly what data was stripped and when.
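As an illustration of what such a JSON export might look like, here is a minimal sketch that records which fields were removed from which file and when. The report layout and the `build_removal_report` helper are hypothetical, not the format of any particular tool. It logs field names only, on the assumption that copying the removed values into the report would recreate the leak.

```python
import json
from datetime import datetime, timezone


def build_removal_report(filename: str, removed_fields: list) -> str:
    """Serialize a simple audit record of stripped metadata fields as JSON."""
    report = {
        "file": filename,
        "stripped_at": datetime.now(timezone.utc).isoformat(),
        # Names only: the values are what the stripping was meant to protect.
        "removed_fields": sorted(removed_fields),
    }
    return json.dumps(report, indent=2)


report_json = build_removal_report("vacation.jpg", ["Make", "GPSLongitude", "GPSLatitude"])
print(report_json)
```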
Is there a complete list of all metadata fields?
No, there is no single universal list. Metadata is defined by various standards (EXIF, IPTC, XMP, Dublin Core) and operating systems, each with hundreds of fields. The specific fields present depend on the file type, the software used to create it, and any custom tags added by enterprise systems.
What is the most dangerous metadata field in a photo?
GPS coordinates are often considered the most sensitive, as they reveal the exact physical location where a photo was taken. Other risky fields include the camera serial number, which can sometimes be traced back to a specific purchase record, and the creation timestamp, which can establish an alibi or timeline.
Does saving a file as a new format remove metadata?
Not necessarily. Converting a file (e.g., DOCX to PDF) often carries over existing metadata or generates new tags based on the conversion software. To ensure complete removal, you must use a dedicated metadata stripper that explicitly targets and deletes these hidden blocks.
Can I see metadata without uploading my file?
Yes. Browser-based tools that run client-side (using JavaScript or WebAssembly) can inspect and modify files locally. This means the file never leaves your computer, ensuring privacy. You can verify this by checking your browser's network tab while the tool is running.
What is the difference between EXIF and XMP?
EXIF is primarily used for technical camera data like settings and GPS. XMP is a more flexible, extensible format that can contain descriptive data, rights information, and custom tags. Modern images often contain both, so effective cleaning requires addressing both schemas.