
METHODOLOGY

How LiveArt's AI actually works.

We believe the art market deserves AI that is both powerful and transparent. This page describes how LiveArt Estimate™ and the broader AI stack are built — honestly, including what they don't do.

LiveArt Estimate™ — artwork cards with current estimate, market cap, and 12-month momentum.
10M+

Auction records · 1986–today

100+

Features per artwork

3

Specialized models · one estimate

Walk-forward

Validation · year N → N+1

HOW LAE IS BUILT

Three questions. Three models.

LiveArt Estimate™ answers one question — what is this artwork worth right now? — by separating it into three distinct sub-questions, each answered by a different model. The three components compose into a single price with a confidence range.

THE THREE QUESTIONS

01

What is this type of work worth, in the abstract?

Base price

02

How does this specific type of work drift in value over time?

Artwork-specific trend

03

How is the art market as a whole doing?

Market trend
◆ THE THREE COMPONENTS

COMPONENT 3 · MARKET

Market trend

Repeat-sales regression

Tracks pure market cycles by analyzing works that have sold more than once. Differencing pairs of sales isolates market movement from artwork-specific drift.

COMPONENT 2 · ARTWORK

Artwork-specific trend

Gradient-boosted decision trees

Captures how specific work categories — by artist, medium, period, size — drift relative to the overall market. Tree-based models capture interaction effects a single regression would miss.

COMPONENT 1 · BASE

Base price

XGBoost regression

The intrinsic value of an artwork independent of market timing. Trained on 100+ features per artwork: artist, medium, size, period, provenance, and more.
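As a rough illustration of the repeat-sales idea behind Component 3, the sketch below fits a textbook log-price-difference regression on period dummies. It is not LiveArt's production code: the function names `solve_linear` and `repeat_sales_index`, the tuple layout, and the toy prices are all assumptions made for the example.

```python
import math


def solve_linear(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: a period has no usable pairs")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def repeat_sales_index(pairs, n_periods):
    """Estimate log market index levels from repeat-sale pairs.

    pairs: (t_buy, price_buy, t_sell, price_sell), periods in 0..n_periods-1.
    Period 0 is the base and is fixed at 0.
    """
    k = n_periods - 1
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for t0, p0, t1, p1 in pairs:
        y = math.log(p1 / p0)          # log return on the same artwork
        row = [0.0] * k                # -1 at purchase period, +1 at sale period
        if t0 > 0:
            row[t0 - 1] = -1.0
        if t1 > 0:
            row[t1 - 1] = 1.0
        for i in range(k):             # accumulate normal equations X'X and X'y
            xty[i] += row[i] * y
            for j in range(k):
                xtx[i][j] += row[i] * row[j]
    return [0.0] + solve_linear(xtx, xty)


# Toy data: three works of very different price levels, same market path.
true_index = [0.0, 0.1, 0.3]
pairs = [
    (0, 1_000 * math.exp(true_index[0]), 1, 1_000 * math.exp(true_index[1])),
    (1, 50_000 * math.exp(true_index[1]), 2, 50_000 * math.exp(true_index[2])),
    (0, 200 * math.exp(true_index[0]), 2, 200 * math.exp(true_index[2])),
]
index = repeat_sales_index(pairs, n_periods=3)
```

In this toy case the recovered index matches the market path used to generate the prices, even though the three works sit at price levels that differ by orders of magnitude: each work's own level drops out of the difference.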

PUTTING IT TOGETHER

One artwork. One estimate.

For any artwork at any point in time, the final LAE is the sum of these three components. The result is a current estimated price with a confidence range — and, because the components are time-aware, an estimate for any historical point.
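The composition step can be sketched as follows. The page states only that the three components are summed; treating them as log-price terms, the `log_sigma` uncertainty input, the normal-quantile range, and every name here (`Estimate`, `compose_estimate`) are assumptions for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Estimate:
    low: float
    point: float
    high: float


def compose_estimate(base, artwork_trend, market_trend, log_sigma, z=1.2816):
    """Sum three components (assumed to be in log-price space) into a price.

    log_sigma: assumed standard error of the summed log estimate.
    z: standard normal quantile; 1.2816 gives a two-sided 80% range.
    """
    log_price = base + artwork_trend + market_trend
    return Estimate(
        low=math.exp(log_price - z * log_sigma),
        point=math.exp(log_price),
        high=math.exp(log_price + z * log_sigma),
    )


# Hypothetical inputs: a 10,000 base, +5% artwork drift, -2% market move.
estimate = compose_estimate(math.log(10_000), 0.05, -0.02, log_sigma=0.10)
```

Because the trend terms depend on the date, evaluating the same sum with the trend values for an earlier date would give a historical estimate for the same work.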

DESIGN DECISIONS

Why three models, not one.

The obvious question is why not use one big model. The answer comes down to three properties of the art market that a single model handles poorly.

01

Data is sparse per artwork

Most artworks transact once or not at all, so a model that needs many observations per artwork has little to learn from. Pair analysis on repeat sales sidesteps this by drawing the market signal from the works that have sold more than once.

02

Quality is unobserved

Artist significance, retrospective inclusion, condition nuance: these matter to price but aren't in the data. Differencing cancels these latent factors when comparing repeat sales of the same work, to the extent they stay constant between the two sales.

03

Interactions dominate

A Basquiat painting and a Basquiat drawing don't trend together. A linear model can't capture that. Tree-based models can — but only if the question they're solving is narrow enough.

Signal leak — where one model's output contaminates another's training data — is the most common failure mode in art-market AI. Separating questions enforces clean inputs.
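The differencing argument can be written out under a simple assumed log-price model. The symbols are illustrative rather than LiveArt's own notation: P is the sale price of work i at time t, q is its unobserved, time-invariant quality, m is the market level, and ε is noise.

```latex
\begin{aligned}
\log P_{i,t} &= q_i + m_t + \varepsilon_{i,t} \\
\log P_{i,t_2} - \log P_{i,t_1}
  &= (q_i - q_i) + (m_{t_2} - m_{t_1}) + (\varepsilon_{i,t_2} - \varepsilon_{i,t_1}) \\
  &= (m_{t_2} - m_{t_1}) + (\varepsilon_{i,t_2} - \varepsilon_{i,t_1})
\end{aligned}
```

The quality term drops out only if it is the same at both sales; anything that changes between them, such as condition, stays in the residual.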

BEYOND THE CURRENT ESTIMATE

A complete price history, not just a point estimate.

Because LAE is built on a time-aware architecture — the three components each carry time information — the model can produce an estimated price for any artwork at any point in its history. That makes portfolio analytics possible.

PRO FORMA RETURNS

Returns on hypothetical holding periods. Pick any two dates, get a return — at the artwork, artist, or portfolio level.

INDEX BENCHMARKS

Compare against LiveArt indices, blue-chip cohorts, or traditional benchmarks (S&P 500, bonds, gold). Same currency, same period.

PORTFOLIO ANALYTICS

Sharpe ratios, drawdowns, correlation matrices, volatility — derived from continuous price series, not just confirmed sales.
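A minimal sketch of the metrics named above, computed from an evenly spaced estimated price series. The function names, the monthly frequency, the zero risk-free rate, and the toy prices are assumptions; the formulas are the standard textbook definitions.

```python
import math
from statistics import mean, stdev


def holding_period_return(prices, start, end):
    """Simple return between two positions in a price series."""
    return prices[end] / prices[start] - 1.0


def periodic_returns(prices):
    """Period-over-period simple returns."""
    return [b / a - 1.0 for a, b in zip(prices, prices[1:])]


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=12):
    """Annualized Sharpe ratio from periodic returns (sample std dev)."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess) * math.sqrt(periods_per_year)


def max_drawdown(prices):
    """Largest peak-to-trough decline, as a negative fraction."""
    peak, worst = prices[0], 0.0
    for p in prices:
        peak = max(peak, p)
        worst = min(worst, p / peak - 1.0)
    return worst


# Toy monthly series of estimated prices (made-up numbers).
prices = [100.0, 110.0, 99.0, 120.0]
total_return = holding_period_return(prices, 0, 3)
drawdown = max_drawdown(prices)
sharpe = sharpe_ratio(periodic_returns(prices))
```

These statistics need a value at every period, which is why they depend on a continuous estimated series rather than on confirmed sales alone.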

HONEST LIMITATIONS

What the model doesn't claim to do.

Every model has limits. We publish ours so consumers of LAE can use it appropriately.

01

LAE works best for liquid artists.

Most accurate for artists with sustained auction activity — typically the top 500–1,000 artists by transaction volume. Below that, confidence ranges widen accordingly.

02

Emerging artists are hard.

Predictions lag when an artist's market is expanding rapidly. Historical data alone is a weak predictor of current value when that market is reshaping in real time.

03

Primary market is not in scope.

Gallery and private sale prices are not in the training data. For living artists where the primary market dominates, LAE reflects auction signal only.

04

Auction noise is partially filtered.

Manipulated sales, guarantees, and buy-ins introduce noise that no model perfectly removes. We filter what we can and surface confidence ranges as a reliability indicator.

05

LAE is a starting point, not a final answer.

The model augments the specialist, the appraiser, and the advisor; it does not replace them. Confidence ranges exist precisely because a single number is rarely the right answer.

BEYOND LAE

What else the AI does.

LAE is the headline output. Other components of the AI stack run on the same data foundation.

Price momentum

Repeat-sales-filtered 12-month signal at the artist or category level.

Artist embeddings

64-dimensional vectors enabling similarity comparison and clustering across 350K+ artists.

Similarity vectors

Comparable-artwork retrieval based on visual and metadata features — the workhorse behind cataloguing and search.

Market signals

Real-time structured signals from auction calendars, results, and corrections.

Image recognition

Cataloguing workflows: artist attribution, medium detection, edition matching from photos.

Historical LAE

Time-machine estimates for portfolio reconstruction, attribution analysis, and academic research.
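To make the embedding-based similarity comparison concrete, here is a small sketch using cosine similarity over 64-dimensional vectors. Cosine similarity is a common choice but an assumption here, since the page does not say which metric is used; the artist keys and vector values are hypothetical placeholders.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest_artists(query, embeddings, k=3):
    """Return the k artist keys whose vectors are most similar to query."""
    ranked = sorted(
        embeddings.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Hypothetical 64-dimensional embeddings (values chosen for a clear ranking).
embeddings = {
    "artist_a": [1.0] * 64,
    "artist_b": [1.0] * 63 + [0.5],
    "artist_c": [1.0 if i % 2 == 0 else -1.0 for i in range(64)],
}
ranking = nearest_artists(embeddings["artist_a"], embeddings, k=3)
```

At the scale of 350K+ artists a brute-force sort like this would normally give way to an approximate nearest-neighbour index, but the comparison being made is the same.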

MODEL VALIDATION

How we know the model is working.

Validation is the part most AI vendors quietly skip. Here's the approach.

WALK-FORWARD VALIDATION

We train on data through year N and test on year N+1. The sequence: train through 2022, test on 2023; train through 2023, test on 2024; train through 2024, test on 2025. Because the model never sees data from the test year or later, it cannot interpolate between known points, a common form of cheating in time-series ML.

TRAIN 2022 → TEST 2023 · TRAIN 2023 → TEST 2024 · TRAIN 2024 → TEST 2025
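The split logic can be sketched in a few lines. This assumes an expanding training window (all data up to and including year N), which is one reading of "data through year N"; the function name `walk_forward_splits` and its parameters are hypothetical.

```python
def walk_forward_splits(years, first_test_year):
    """Yield (train_years, test_year) pairs with no look-ahead.

    Each test year is evaluated by a model that saw only strictly
    earlier years.
    """
    ordered = sorted(years)
    for test_year in ordered:
        if test_year < first_test_year:
            continue
        train_years = [y for y in ordered if y < test_year]
        if train_years:
            yield train_years, test_year


# Auction records from 1986 onward, tested on 2023, 2024, and 2025.
splits = list(walk_forward_splits(range(1986, 2026), first_test_year=2023))
test_years = [test for _, test in splits]
```

A random train/test shuffle would let sales from 2024 inform predictions for 2023, which is exactly the interpolation this scheme rules out.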

01

Stratified error reporting

Mean absolute error reported by artist tier, price bucket, medium, and region — not just a global headline number.

02

Calibrated confidence intervals

We check that a nominal 80% range actually contains roughly 80% of realized prices. Miscalibrated intervals are worse than wide ones.

03

Versioned models, published changelogs

Each model carries a version. Material changes ship with a changelog noting what shifted and why. Enterprise clients receive segment-level performance reports.
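The first two checks above can be sketched as follows. The record fields, the use of relative (percentage) absolute error, and the toy numbers are assumptions for illustration; they are not LiveArt's reporting schema or results.

```python
from collections import defaultdict


def stratified_mae(records, key):
    """Mean absolute relative error per stratum (e.g. artist tier)."""
    errors = defaultdict(list)
    for r in records:
        errors[r[key]].append(abs(r["estimate"] - r["realized"]) / r["realized"])
    return {stratum: sum(v) / len(v) for stratum, v in errors.items()}


def interval_coverage(records):
    """Share of realized prices that fall inside the stated range."""
    hits = sum(1 for r in records if r["low"] <= r["realized"] <= r["high"])
    return hits / len(records)


# Made-up test-year records with nominal 80% ranges.
records = [
    {"tier": "top", "estimate": 100, "low": 90, "high": 115, "realized": 110},
    {"tier": "top", "estimate": 200, "low": 180, "high": 220, "realized": 190},
    {"tier": "mid", "estimate": 50, "low": 42, "high": 58, "realized": 40},
    {"tier": "mid", "estimate": 80, "low": 70, "high": 90, "realized": 80},
    {"tier": "mid", "estimate": 30, "low": 27, "high": 34, "realized": 33},
]
mae_by_tier = stratified_mae(records, key="tier")
coverage = interval_coverage(records)
```

Coverage well below the nominal level means the ranges are too narrow, and coverage well above it means they are wider than they need to be. Either way, the comparison only makes sense on held-out test years.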

HOW WE THINK ABOUT AI

The principles behind the methodology.

01

Transparency over mystique.

Published methodology. Visible confidence ranges. No black boxes for prestige.

02

Augment experts, don't replace them.

The model supports specialist judgment. It is a starting point, not a verdict.

03

Purpose-built models, not one giant model.

Each question gets the model that fits it. We resist the urge to throw everything into a single architecture.

04

Continuous validation.

The market changes. The model is retrained, re-tested, and reported on a regular cadence.

Questions about the methodology?

◆ ENGINEERING · TECHNICAL DEEP-DIVE

Talk to the team.

Our engineering team is available to discuss architecture, validation approach, and model performance in detail. Schedule a session for your quants or research desk.
