Skip to main content

Zoom on Large Datasets

Victory can handle hundreds of data points, but what if you'd like chart thousands of points? VictoryZoomContainer can be useful here, allowing the user to focus on the subset of data they are most interested in.

By default Victory will render all data points in a dataset. For large datasets this behavior can be overridden, and this guide will show you how. If you haven't used it before, read about the VictoryZoomContainer first, then come back to this guide.

In a hurry? Skip to the demo.

Basic scenario: time-series data

In this guide, we'll be working with time-series data. We'll make a few basic assumptions:

  1. zooming will be done only on the x (or time) dimension,
  2. our data will be ordered by x, from earliest to latest, and
  3. the dataset is static - no new data will arrive while the user is interacting with the chart.

These just serve to simplify the example. We'll start with a simple chart:

function CustomChart(props) {
const [state, setState] = React.useState({});

return (
<VictoryChart containerComponent={<VictoryZoomContainer/>}
<VictoryScatter data={props.data} />
</VictoryChart>
);
}

Render only visible points

Rather than passing all our data to the data prop, we'll first remove any data that isn't currently visible. To do this, we must keep track of the chart's visible domain; VictoryZoomContainer has an onZoomDomainChange prop that will allow us to do exactly that:

<VictoryChart
containerComponent={<VictoryZoomContainer
onZoomDomainChange={onDomainChange}
/>}
>
<VictoryScatter data={getData()} />
</VictoryChart>

Update the state every time the domain changes (note that we are only keeping track of the x dimension):

onDomainChange(domain) {
setState({
zoomedXDomain: domain.x,
});
}

Use this zoomedXDomain state to filter out all data that isn't currently visible. Here we're making a simple getData method; note the data array's filter function:

function getData() {
const { zoomedXDomain } = state;
const { data } = props;
return data.filter(
// is d "between" the ends of the visible x-domain?
(d) => (d.x >= zoomedXDomain[0] && d.x <= zoomedXDomain[1]));
}

Because we are dynamically changing the data prop on VictoryChart, we must also explicitly set its domain. By default VictoryChart will calculate a domain from data; in this case that would mean that the chart would "forget" about the rest of the data as the user zoomed in. In fact, there would be no way to zoom back out!

To remedy this, we must calculate the domain of the entire dataset:

function getEntireDomain(props) {
const { data } = props;
return {
y: [_.minBy(data, d => d.y).y, _.maxBy(data, d => d.y).y],
x: [ data[0].x, _.last(data).x ]
};
}

We use Lodash's minBy and maxBy functions to find the domain of y. Because we assume that the data is ordered by x, we can use the first and last points for the x domain.

Because we are assuming the data is static we just need to call this function once, in the constructor of CustomChart. The calculated x domain will also be used as the initial value for state.zoomedXDomain:

const entireDomain = getEntireDomain(props);
const [state, setState] = React.useState({
zoomedXDomain: entireDomain.x,
});

The static value entireDomain can then be used by VictoryChart:

<VictoryChart
domain={entireDomain}
containerComponent={<VictoryZoomContainer
zoomDimension="x"
onZoomDomainChange={onDomainChange}
/>}
>
<VictoryScatter data={getData()} />
</VictoryChart>

Now we are only rendering the visible points, but this step isn't nearly enough: when the chart is zoomed out we still render all of the data points!

Render a small sample of points

There are a number of possible methods to reduce the number of visible data points rendered. We'll use the simplest method: selecting only every kth point, and discarding all others. k is determined simply: if there are 5,000 points and we only want to show 100, then k is 50.

We'll add a maxPoints prop to CustomChart which will determine the maximum number of points returned by getData. All of the existing logic in getData will stay and another filter will be added afterwards:

function getData() {
const { zoomedXDomain } = state;
const { data, maxPoints } = props;
const filtered = data.filter(
(d) => (d.x >= zoomedXDomain[0] && d.x <= zoomedXDomain[1]));

// new code here...
if (filtered.length > maxPoints ) {
const k = Math.ceil(filtered.length / maxPoints);
return filtered.filter(
(d, i) => ((i % k) === 0)
);
}
return filtered;
}

Now the chart will always render at most maxPoints, no matter the zoom level.

Demo

Live View
Loading...
Live Editor

Extending this Demo

This guide serves as a start, but you might have some questions:

  • How big should maxPoints be? For most situations between 50 and 150 is ideal.
  • What if I want to render millions of data points? This concept can be extended to millions of points, but you'll need the help of a library to handle the sampling. Try Crossfilter.
  • Can I remove the "flicker" of points as I zoom in? Yes, but getData() will have to be a little more complex. This apparent movement of the points while zooming happens because different points are chosen to be displayed. Here is an example that reduces flicker by reliably choosing the same data points to display:
function getData() {
const { zoomedXDomain } = state;
const { data, maxPoints } = props;

const startIndex = data.findIndex((d) => d.x >= zoomedXDomain[0]);
const endIndex = data.findIndex((d) => d.x > zoomedXDomain[1]);
const filtered = data.slice(startIndex, endIndex);

if (filtered.length > maxPoints ) {
// limit k to powers of 2, e.g. 64, 128, 256
// so that the same points will be chosen reliably, reducing flicker
const k = Math.pow(2, Math.ceil(Math.log2(filtered.length / maxPoints)));
return filtered.filter(
// ensure modulo is always calculated from same reference: i + startIndex
(d, i) => (((i + startIndex) % k) === 0)
);
}
return filtered;
}