---
title: "Analysis of Financial Time Series Data Using MariaDB ColumnStore"
publish_date: 2017-07-13
updated_date: 2022-08-26
author: "Satoru Goto"
tags:
  - name: "ColumnStore"
    url: "/resources/blog/tag/columnstore.md"
  - name: "finance"
    url: "/resources/blog/tag/finance.md"
  - name: "Financial Time"
    url: "/resources/blog/tag/financial-time.md"
  - name: "forex"
    url: "/resources/blog/tag/forex.md"
  - name: "industry"
    url: "/resources/blog/tag/industry.md"
  - name: "Linux"
    url: "/resources/blog/tag/linux.md"
  - name: "local cluster"
    url: "/resources/blog/tag/local-cluster.md"
  - name: "MariaDB Server"
    url: "/resources/blog/tag/mariadb-server.md"
  - name: "open source columnar database"
    url: "/resources/blog/tag/open-source-columnar-database.md"
  - name: "queries"
    url: "/resources/blog/tag/queries.md"
  - name: "Series Data"
    url: "/resources/blog/tag/series-data.md"
---

# Analysis of Financial Time Series Data Using MariaDB ColumnStore

[MariaDB ColumnStore](https://staging-mdb.com/products/technology/columnstore) is an open source columnar database built on [MariaDB Server](https://staging-mdb.com/products/technology/server). It can be deployed in the cloud or on a local cluster of Linux servers using either local or networked storage. In this blog post, I will show some examples of analysis of financial time series data using MariaDB ColumnStore.

## Free Forex Historical Data at HistData.com

First of all, we will download forex historical data ([GBPUSD M1 Full 2016 Year Data](http://www.histdata.com/download-free-forex-historical-data/?/metatrader/1-minute-bar-quotes/gbpusd/2016)) which is freely available on [HistData.com](http://www.histdata.com/download-free-forex-data/). HistData.com forex historical data look like:

```
20160103 170000;1.473350;1.473350;1.473290;1.473290;0
20160103 170100;1.473280;1.473360;1.473260;1.473350;0
20160103 170200;1.473350;1.473350;1.473290;1.473290;0
20160103 170300;1.473300;1.473330;1.473290;1.473320;0
```

1st column is timestamp of currency rate of every minute, but needed to convert the format, in order to fit with ColumnStore DATETIME data type. I modified the timestamp format using the following simple Ruby script :

```
convert.rb:
#!/usr/bin/env ruby
id = 0
while line = gets
  datetime, open, high, low, close = line.split(";")
  year, month, day, hour, minute = datetime.unpack("a4a2a2xa2a2")
  id+= 1
  print "#{id},#{year}-#{month}-#{day} #{hour}:#{minute},"
  puts  [open, high, low, close].join(',')
end
```

For example, you can convert the historical data as follows:

`ruby convert.rb DAT_ASCII_GBPUSD_M1_2016.csv > gbpusd2016.csv`

The converted forex historical data will be:

```
1,2016-01-03 17:00,1.473350,1.473350,1.473290,1.473290
2,2016-01-03 17:01,1.473280,1.473360,1.473260,1.473350
3,2016-01-03 17:02,1.473350,1.473350,1.473290,1.473290
4,2016-01-03 17:03,1.473300,1.473330,1.473290,1.473320

```

This demo was performed on following setup:

- CPU: Intel Core i7-7560U 2.4 GHz 2 physical cores / 4 logical cores
- OS: CentOS 7.3 ( virtualized on VMware Workstation 12 Pro on Windows 10 Pro)
- Memory: 4GB
- ColumnStore: 1 UM &amp; 1 PM on single node

Next, we create a database and a table with MariaDB monitor:

```
# mcsmysql
Welcome to the MariaDB monitor.  Commands end with ; or g.
Your MariaDB connection id is 4
Server version: 10.1.23-MariaDB Columnstore 1.0.9-1

Copyright (c) 2000, 2017, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or 'h' for help. Type 'c' to clear the current input statement.

MariaDB [(none)]> CREATE DATABASE forex;
MariaDB [(none)]> USE forex;
MariaDB [forex]> CREATE TABLE gbpusd (
    id INT, 
    time DATETIME,
    open DOUBLE,
    high DOUBLE, 
    low DOUBLE, 
    close DOUBLE
    ) engine=ColumnStore default character set=utf8;
```

Notes:  
With standard ColumnStore installation, MariaDB monitor command is mcsmysql ( aliased to /usr/local/mariadb/columnstore/mysql/bin/mysql )  
Storage engine must be ColumnStore

Now it’s time to import CSV data whose timestamps are corrected. For bulk import, we use cpimport command with ColumnStore. With -s (separator) option, we specify the delimiter character (in this example, comma).

```
# cpimport -s ',' forex gbpusd gbpusd2016.csv
Locale is : C
Column delimiter : ,

Using table OID 3163 as the default JOB ID
Input file(s) will be read from : /home/vagrant/histdata
Job description file : /usr/local/mariadb/columnstore/data/bulk/tmpjob/3163_D20170624_T103843_S950145_Job_3163.xml

…

2017-06-24 10:38:45 (29756) INFO : For table forex.gbpusd: 372480 rows processed and 372480 rows inserted.
2017-06-24 10:38:46 (29756) INFO : Bulk load completed, total run time : 2.11976 seconds
```

We could import 372,280 rows of forex data in 2 seconds. LOAD DATA LOCAL INFILE can be used as well with ColumnStore.

## Sample queries using functions

The reason why GBPUSD M1 data 2016 was chosen, is to track very quick moves of GBPUSD before and after the [Brexit vote](https://www.theguardian.com/politics/2016/jun/24/britain-votes-for-brexit-eu-referendum-david-cameron), which was held on 23th June 2016.

![gbpusd-h4.png](/sites/default/files/users/user82499/gbpusd-h4.png)

Now we look at maximum, minimum and drop off rate during 23th – 24th June 2016, using Windows Functions of ColumnStore.

```
MariaDB [forex]> SELECT MAX(close) FROM gbpusd WHERE time BETWEEN TIMESTAMP ('2016-06-23') AND TIMESTAMP ('2016-06-25');
+------------+
| MAX(close) |
+------------+
|    1.50153 |
+------------+
1 row in set (0.03 sec)

MariaDB [forex]> SELECT MIN(close) FROM gbpusd WHERE time BETWEEN TIMESTAMP ('2016-06-23') AND TIMESTAMP ('2016-06-25');
+------------+
| MIN(close) |
+------------+
|    1.32322 |
+------------+
1 row in set (0.03 sec)

MariaDB [forex]> SELECT MAX(close)/MIN(close) FROM gbpusd WHERE time BETWEEN TIMESTAMP ('2016-06-23') AND TIMESTAMP ('2016-06-25');
+-----------------------+
| MAX(close)/MIN(close) |
+-----------------------+
|    1.1347546399658233 |
+-----------------------+
1 row in set (0.03 sec)

```

GBPUSD fell from 1.50 to 1.32 (-13%) in 2 days (actually in half a day).

## Correlation GBPUSD – USDJPY on 23th June 2016

Next graph shows scatter plot of GBPUSD vs. USDJPY during 23th – 24th June 2016. (USDJPY M1 2016 were imported same as GBPUSD)

![scatter-plot-gbpusd-usdjpy.png](/sites/default/files/users/user82499/scatter-plot-gbpusd-usdjpy.png)

Note: GBPUSD is multiplied by 100, then subtracted by 130. USDJPY is subtracted by 110.

Now we calculate Pearson correlation coefficient using statistical aggregation functions. Following SELECT statement was used:

```
SELECT
  (AVG(gbpusd.close*usdjpy.close) - AVG(gbpusd.close)*AVG(usdjpy.close)) / 
  (STDDEV(gbpusd.close) * STDDEV(usdjpy.close))
AS correlation_coefficient_population
FROM gbpusd
JOIN usdjpy ON gbpusd.time = usdjpy.time
WHERE
  gbpusd.time BETWEEN TIMESTAMP('2016-06-23') AND TIMESTAMP('2016-06-25');
```

Query result:

```
+ ------------------------------------+
| correlation_coefficient_population |
+------------------------------------+
|                 0.9647697302087848 |
+------------------------------------+
1 row in set (0.57 sec)
```

GBPUSD and USDJPY were highly correlated during the UK EU membership referendum.

## Moving Average GBPUSD 23th – 24th June 2016

In Forex trading, moving average is often used to smooth out price fluctuations. The following query allows moving average of sliding 13 rows window.

```
SELECT 
  time,
  close,
  AVG(close) OVER ( 
               ORDER BY time ASC
                 ROWS BETWEEN 
                   6 PRECEDING AND
                   6 FOLLOWING ) AS MA13,
  COUNT(close) OVER ( 
               ORDER BY time ASC
                 ROWS BETWEEN 
                   6 PRECEDING AND
                   6 FOLLOWING ) AS row_count
FROM gbpusd
WHERE time BETWEEN TIMESTAMP('2016-06-23') AND TIMESTAMP('2016-06-25');
```

![GBPUSD-MA13.png](/sites/default/files/users/user82499/GBPUSD-MA13.png)

## Summary

In this blog post, the following features of MariaDB ColumnStore were explained:

- With cpimport command, users can easily perform fast bulk import to MariaDB ColumnStore tables.
- MariaDB ColumnStore allows fast and easy data analytics using SQL 2003 compliant Window Functions, such as MAX, MIN, AVG and STDDEV.
- No index management for query performance tuning needed.

MariaDB ColumnStore can be downloaded [here](https://staging-mdb.com/downloads/columnstore), detailed instructions to install and test MariaDB ColumnStore on Windows using Hyper-V can be found [here](https://staging-mdb.com/resources/blog/testing-mariadb-columnstore-windows-using-hyper-v).