Nick Gaudio, Head of Brand and Content — November 24, 2025

Snowflake Metadata: A Beginner’s Guide


Snowflake is where your data lives. Metadata is how it talks.

If you’re just getting started with Snowflake, “metadata” can sound like one of those abstract governance words everyone nods along to in meetings. Ah yes. Meta branded Data.

In reality, Snowflake metadata is very concrete and (thankfully) very queryable. It tells you what you have, how it’s structured, who’s using it, and how it’s changing over time.

And more and more, it decides whether your AI and analytics projects feel magical… or miserable.

This guide walks through how Snowflake metadata actually works and what you can do with it, even if you’re not a full-time data engineer.

1. What Snowflake metadata actually is

Let’s strip away the jargon.

In Snowflake, metadata is simply data about your data and how it’s used. Snowflake keeps track of the objects in your account (databases, schemas, tables, views, tasks, models, and so on).

It knows the structure of those objects: their columns, data types, clustering, and constraints. It records who runs which queries, how long they take, how many bytes they scan, and which warehouse they use. It tracks roles, grants, tags, masking policies, and classifications. And it understands how objects depend on each other—what feeds what, and what might break if you change something.

All of that is stored in Snowflake’s internal catalogs and exposed as system views and table functions. You query them with normal SQL. If you’ve ever run:

SELECT * FROM MY_DB.INFORMATION_SCHEMA.TABLES;

you’ve already worked with Snowflake metadata. You just may not have called it that yet.

2. The three main views of Snowflake metadata

Snowflake gives you three big “lenses” on metadata: INFORMATION_SCHEMA, ACCOUNT_USAGE, and Horizon. They all describe the same universe, but at different levels of zoom.

INFORMATION_SCHEMA: the local map

Every database has an INFORMATION_SCHEMA that conforms roughly to ANSI SQL conventions. It’s your local map of what exists inside that database: tables, views, columns, functions, procedures, and so on.

If you want to know “what’s in this database?” or “what does this table look like?”, you start here. For example, to list tables and views:

SELECT table_schema, table_name, table_type
FROM MY_DB.INFORMATION_SCHEMA.TABLES
ORDER BY table_schema, table_name;

And to inspect a specific table’s columns:

SELECT column_name, data_type, is_nullable
FROM MY_DB.INFORMATION_SCHEMA.COLUMNS
WHERE table_schema = 'CORE'
  AND table_name = 'CUSTOMERS'
ORDER BY ordinal_position;

INFORMATION_SCHEMA is a snapshot. It shows you the state of things right now: what exists, how it’s defined, and who owns it.

ACCOUNT_USAGE: the account-wide time machine

The second lens is the SNOWFLAKE.ACCOUNT_USAGE schema. This is where Snowflake exposes account-wide, historical metadata. It looks very similar to INFORMATION_SCHEMA at first glance — there are views for tables, columns, objects, and so on — but it also includes rich query history, access history, and governance information.

This is where you go when the question isn’t “what exists?” but “what’s actually happening?” or “what happened last week?”

For example, to see queries from the past seven days:

SELECT query_id, user_name, start_time, total_elapsed_time, bytes_scanned
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
ORDER BY start_time DESC;

This view tells you who is hitting the system, which workloads are expensive, and how performance is trending over time. Other views in ACCOUNT_USAGE tell you which objects have been dropped, which warehouses are consuming the most credits, which roles are assigned to which users, and so forth.
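Warehouse credit consumption is exposed the same way. As a sketch, the standard WAREHOUSE_METERING_HISTORY view can be aggregated to show which warehouses are burning the most credits over the past week:

```sql
-- Credits consumed per warehouse over the past 7 days
SELECT warehouse_name,
       SUM(credits_used) AS total_credits
FROM SNOWFLAKE.ACCOUNT_USAGE.WAREHOUSE_METERING_HISTORY
WHERE start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY total_credits DESC;
```

Note that ACCOUNT_USAGE views have some latency (typically minutes to a few hours), so very recent activity may not appear yet.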

You can think of ACCOUNT_USAGE as the audit trail and black box recorder for your Snowflake account.

Horizon: the governance and discovery brain

On top of those SQL views, Snowflake offers Horizon, its governance and discovery layer.

Horizon is a UI and feature set built on top of the metadata you’ve just seen. It lets you search for objects, see data lineage visually, manage tags and classifications, and reason about policies and access patterns.

Under the hood, Horizon is reading the same metadata you can query yourself... it’s just doing the capital W Work of stitching it into a graph and making it navigable for us humans.

INFORMATION_SCHEMA helps you answer, “What’s here?”
ACCOUNT_USAGE helps you answer, “What’s happening?”
Horizon helps you answer, “How does it all connect—and is it governed the way we want?”

3. How Snowflake captures your metadata without you lifting a finger

One of the nice things about Snowflake is that you don’t have to “turn on” metadata. It’s collected continuously in the background for you.

Whenever you create, alter, or drop an object, Snowflake updates its internal catalog. That change shows up in INFORMATION_SCHEMA for the relevant database and in ACCOUNT_USAGE views like OBJECTS, TABLES, and COLUMNS. You can see when a table was created, who owns it, whether it has been dropped and recreated, and so on.

Every query you run is also captured in query history. Snowflake records the text of the query, the start and end times, the warehouse that executed it, and the user and role that initiated it. For write operations — like INSERT, MERGE, or CREATE TABLE AS SELECT — Snowflake extends this with access history, which shows which source objects were read and which target objects were written. In many cases it can even track how individual source columns contribute to target columns.
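To make this concrete, here is a sketch of reading access history directly. The ACCESS_HISTORY view (available on Enterprise Edition and above) stores the objects read and written as JSON arrays, so you flatten them to get one row per source/target pair:

```sql
-- Which objects did recent queries read from and write to?
SELECT ah.query_id,
       ah.query_start_time,
       src.value:objectName::STRING AS source_object,
       tgt.value:objectName::STRING AS target_object
FROM SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY ah,
     LATERAL FLATTEN(input => ah.direct_objects_accessed) src,
     LATERAL FLATTEN(input => ah.objects_modified) tgt
WHERE ah.query_start_time >= DATEADD('day', -7, CURRENT_TIMESTAMP());
```

Read-only queries have an empty objects_modified array, so this particular join surfaces only the write operations.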

This is the raw material for lineage and impact analysis. It’s how you move from “we think this pipeline depends on that table” to “we know these specific queries read these specific columns.”

On top of that, Snowflake supports tags and automatic classification. Tags are key–value pairs you attach to objects: things like pii = 'true', owner_team = 'revops', or retention = '7_years'. They can be applied at different levels (database, schema, table, column, warehouse) and can inherit or auto-propagate along data flows. Automatic classification can detect likely PII and apply system tags based on patterns, which you can then use to drive masking policies and access controls.
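Applying tags is ordinary DDL. As a minimal sketch (the table and column names here are hypothetical), you define a tag once and then attach it at the table or column level:

```sql
-- Define tags once at the schema level
CREATE TAG IF NOT EXISTS pii;
CREATE TAG IF NOT EXISTS owner_team;

-- Attach a tag to a whole table...
ALTER TABLE core.customers SET TAG owner_team = 'revops';

-- ...and to a specific sensitive column
ALTER TABLE core.customers MODIFY COLUMN email SET TAG pii = 'true';
```

Once applied, tag assignments are themselves metadata: you can query them via the TAG_REFERENCES views and use them to drive masking policies.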

Together, these turn metadata from a passive record into something closer to a policy engine.

4. Lineage: from “what is this table?” to “what happens if I change it?”

Most teams eventually care about lineage, even if they don’t use the word.

There are two kinds that matter day-to-day. The first is structural lineage: which objects depend on which others. If a view reads from a base table, Snowflake tracks that dependency. If you try to drop the base table, Snowflake can warn you that something downstream might break. Horizon’s lineage view lets you click into a table and see its parents and children: the sources that feed it and the views, models, or dashboards that rely on it.
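Structural lineage is queryable too. As a sketch, the OBJECT_DEPENDENCIES view in ACCOUNT_USAGE lists which objects reference a given base object (the CORE.CUSTOMERS names here are placeholders):

```sql
-- Which objects depend on a given base table?
SELECT referencing_database,
       referencing_schema,
       referencing_object_name,
       referencing_object_domain
FROM SNOWFLAKE.ACCOUNT_USAGE.OBJECT_DEPENDENCIES
WHERE referenced_schema = 'CORE'
  AND referenced_object_name = 'CUSTOMERS';
```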

The second kind is data lineage: how data actually flows through your system. This is where access history comes in. For each query, Snowflake records which objects were read, which were written, and which columns were involved. When a pipeline writes from a staging table into a mart table, that movement is captured in metadata.

Once you have that, you can start answering more interesting questions. If you’re looking at a sensitive field in a mart table, you can trace it back to see where it came from and what transformations happened along the way. If you’re planning to deprecate a legacy table, you can scan for queries that still reference it and identify the downstream objects that would be affected.
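A quick-and-dirty way to find those lingering references is to search query history for the table name. This is approximate (text matching will catch comments and similarly named objects, and OLD_CUSTOMERS is a placeholder), but it is often enough to start the deprecation conversation:

```sql
-- Recent queries that still mention a legacy table
SELECT query_id, user_name, start_time
FROM SNOWFLAKE.ACCOUNT_USAGE.QUERY_HISTORY
WHERE query_text ILIKE '%OLD_CUSTOMERS%'
  AND start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
ORDER BY start_time DESC;
```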

If a dashboard starts showing obviously wrong numbers, lineage can help you work backward through the chain of dependencies to find the first broken link.

Lineage is what turns Snowflake from a big pile of tables into a knowable system.

5. Why metadata matters even if you’re not “a data person”

If you’re a RevOps leader, an admin, or a CIO, it’s easy to assume Snowflake metadata hygiene is something your data team deals with in the background. But the way you use metadata has a very direct impact on discoverability, trust, cost, and AI readiness.

On the discovery side, metadata hygiene in Snowflake replaces tribal knowledge with searchable, inspectable facts. Instead of asking “who knows where the good customer table is?”, you can search for objects tagged as core, see how often they’re queried, and understand which teams own them.

Horizon plus a few targeted SQL queries can give you a far more honest picture of your data landscape than any slide deck.

For trust and governance, metadata is the backbone. You can’t protect what you can’t see. Tags and classifications let you mark sensitive data; policies and lineage let you reason about where that data goes and who can touch it. When regulators or security teams ask how certain fields are used, metadata gives you something better than “we think”: it gives you a trace.

Cost and performance are also metadata problems in disguise. Query history and warehouse metering tell you where you’re burning credits. Access patterns tell you which tables justify their storage and which ones are zombie remnants from a long-dead project. When you have that feedback loop, you can tune warehouses, refactor pipelines, and archive unused data from a place of knowledge instead of guesswork.

And then there’s AI. Every serious attempt to put agents or LLMs on top of Snowflake runs into the same wall: the model can’t reason about a warehouse it doesn’t understand. To route questions intelligently, compose queries safely, or respect governance constraints, an agent needs to know what data exists, which tables are canonical, how fields relate to each other, and where the sharp edges are. All of that is metadata. If you skip this step, you’re effectively trying to build a self-driving car without a map.

6. A simple way to start working with Snowflake metadata

You don’t need a full metadata program or a dedicated platform to get started. A few simple practices make a huge difference.

First, take an inventory of what you actually have. Use INFORMATION_SCHEMA to list tables and views in your most important databases and then inspect a handful of critical tables in more detail. The goal here isn’t perfection; it’s to replace the vague sense of “we have a lot of stuff in Snowflake” with a concrete feel for what’s there.

Next, look at how it’s being used. Move into SNOWFLAKE.ACCOUNT_USAGE and explore query history. Identify the most expensive queries over the past week or month. See which warehouses are doing the heavy lifting.

Look for tables that never appear in query text over a reasonable window. You’ll almost certainly discover a mix of hot paths you need to protect and cold data you can archive or delete.
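One way to sketch that cold-data check is to look for tables with no reads in access history over a recent window. This assumes ACCESS_HISTORY is available on your edition and treats fully qualified object names as the join key, so treat the results as candidates to investigate rather than a definitive list:

```sql
-- Tables with no recorded reads in the past 30 days (approximate)
SELECT t.table_catalog, t.table_schema, t.table_name
FROM SNOWFLAKE.ACCOUNT_USAGE.TABLES t
WHERE t.deleted IS NULL
  AND NOT EXISTS (
    SELECT 1
    FROM SNOWFLAKE.ACCOUNT_USAGE.ACCESS_HISTORY ah,
         LATERAL FLATTEN(input => ah.direct_objects_accessed) obj
    WHERE ah.query_start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
      AND obj.value:objectName::STRING =
          t.table_catalog || '.' || t.table_schema || '.' || t.table_name
  );
```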

Finally, start tagging and tracing the things that matter most. Pick a few critical domains — customers, revenue, product usage — and define a simple tag scheme around sensitivity, ownership, and domain. Apply those tags to your key tables and columns. Then use access history and Horizon’s lineage view to see how that tagged data flows across the rest of your Snowflake estate.

You don’t need to tag everything everywhere. Start where the risk and value are highest and let your metadata coverage grow from there.

7. How Sweep fits into your Snowflake metadata story

Everything above is doable with out-of-the-box Snowflake features. Many teams start exactly that way: someone writes ad hoc queries against ACCOUNT_USAGE, someone else clicks around Horizon, and a third person holds “the real picture” in their head or in a half-updated diagram.

That approach works — right up until you try to scale it across multiple domains, keep it up to date continuously, and feed it to agentic AI.

At Sweep, we treat Snowflake metadata as a living blueprint for both automation and agents. Our metadata agents continuously read Snowflake’s metadata — objects, usage, tags, access history — and stitch it together with the rest of your operational graph, including Salesforce. They build a real-time view of how your systems actually connect, not just how they were supposed to connect when someone drew the architecture diagram.

On top of that map, we surface issues that are almost impossible to catch manually: orphaned tables and zombie pipelines, conflicting definitions of core entities like “Account” or “Opportunity,” risky PII flows that don’t align with your policies, and subtle forms of systems drag that quietly slow everything down. Because we speak the language of metadata, we can also give AI agents the context they need to act safely: what’s canonical, what’s deprecated, where governance constraints apply, and which changes will have the biggest impact.

Instead of begging humans to maintain spreadsheets, Confluence pages, and diagrams, you let metadata agents keep the map fresh for you—and for your AI.

Sweeping it up

If Snowflake is your warehouse, metadata is your wiring diagram.

You don’t have to memorize every system view or become a full-time catalog administrator. But if you can pull a basic inventory, read query and access history, understand lineage for your key domains, and put a minimal tagging scheme in place, you’re already ahead of most teams trying to “do AI” on top of an opaque warehouse.

And if you’re ready to turn that metadata into a continuously updated blueprint that powers both humans and agents — across Snowflake, Salesforce, and beyond — that’s exactly the problem Sweep is built to solve!
