New filtering and backtest in AlpacaScan


AlpacaScan screen

AlpacaScan has been updated with a newly designed interface and customizable filtering. This new feature refines the backtest analysis we have been evaluating for the past couple of weeks: you can now clearly see how well each trading opportunity worked in the past, using the statistical percentages produced by the backtest calculation. AlpacaScan not only scans the entire stock market every day, but also lets you filter the opportunities by the statistics that matter to you, such as winning rate or number of events, as well as by fundamentals such as market cap and P/E ratio. You can even save your favorite filter under your account once you sign up, and we will email you fresh results matching your filter conditions every day at the market close.

Each opportunity is presented with a card like this one.


This opportunity for the stock symbol $KAR shows a Bullish Kicker candlestick pattern as of the last market close. This stock, combined with this pattern, moved upwards the following day 73.33% of the 45 times the pattern appeared over the past 5 years. This is not a prediction or fancy magic, but it does give you an idea of how well the stock performed with this pattern in the past. Remember, this is based solely on historical statistics coupled with recent candles, so you may want to look deeper into the opportunity on the stock’s detail page, where a longer-term chart and fundamentals can help you determine if it is the right trade for you.
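For readers curious about the mechanics, a win-rate statistic like the one above can be computed in a few lines. This is a minimal sketch (not AlpacaScan's actual implementation), assuming we already know the bars on which a pattern completed:

```python
# Sketch: how a backtest win rate like "73.33% of 45 events" can be computed.
# `pattern_dates` and `closes` are hypothetical inputs, not AlpacaScan's code.

def win_rate(pattern_dates, closes, horizon=1):
    """Fraction of pattern occurrences followed by a price rise.

    pattern_dates: indices into `closes` where the pattern completed.
    closes: daily closing prices, oldest first.
    horizon: how many bars ahead to measure the move.
    """
    events = [i for i in pattern_dates if i + horizon < len(closes)]
    if not events:
        return 0.0, 0
    wins = sum(1 for i in events if closes[i + horizon] > closes[i])
    return wins / len(events), len(events)

rate, n = win_rate([0, 2, 4], [10, 11, 9, 12, 8, 13], horizon=1)
print(rate, n)  # all 3 events were followed by a rise
```

With real data the same calculation would be run over 5 years of daily bars per symbol, once per pattern.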

The idea for AlpacaScan started when a trader we talked to was looking through AlpacaAlgo and said, “I know this particular candlestick pattern works well in many cases but as a human being, I can’t check all 7,000 stocks every day.”
We quickly built a small prototype that did the job for him. It scanned the 7,000 daily charts of the U.S. stock market and presented them with symbolic chart thumbnails. As it turned out, people loved it, despite (or thanks to) the simplicity of the service. We then decided to add our data processing technology to augment the value of each opportunity with backtesting calculations, a user-friendly interface design, and filtering and customization features. Today’s Scan is still a pre-MVP of everything we have in mind, but it’s definitely something our team loves (we trade based on Scan results a lot), and so does our community of casual traders. As casual traders ourselves, we feel there is a huge gap to fill between super-active traders and buy-and-hold passive investors. It’s time for a movement to bring the modern world to outdated financial sectors, similar to the one we have seen in e-commerce and social media in recent years. Going forward, we are going to enhance the user interface, bring in social interaction among users, develop more tailor-made opportunity analysis technology, connect to your existing trading platforms, and most importantly, deliver more valuable opportunities you can act on.

We hope you enjoy the new AlpacaScan! As always, we are very happy to hear any feedback you have, so please reach out to us in the chat window at the bottom right in Scan, on Twitter, or by email (info@alpacadb.com).

With Over 10,000 Algorithms Built, Capitalico Advances Its Deep Learning Trading System, Now Called AlpacaAlgo

Alpaca’s deep neural net model has been newly designed to reduce the trading algorithm build-time from 10 minutes to a few seconds.

San Mateo, California, Sep 7th, 2016 — Alpaca, the leading AI startup in financial technology, releases its new deep learning engine today in its flagship trading system, AlpacaAlgo (formerly known as Capitalico).  The California-based fintech company also reveals that the number of algorithms built on AlpacaAlgo has exceeded 10,000 and about 100,000 trade alerts have been generated in only 6 months since it was first launched to the public in March of 2016.

Alpaca’s proprietary deep learning system learns how its users trade from their highlighted portions of historic candlestick charts. The neural net model has been newly designed to reduce the algorithm build-time from 10 minutes to a few seconds.  This cutting-edge AI technology allows AlpacaAlgo’s users to design trading algorithms faster and more interactively to help improve trading performance.  Its user interface has been redesigned as well to leverage the new neural net model’s advantages.


According to the deep learning company, which has worked on applying AI technology to financial trading, the newly developed deep learning system for AlpacaAlgo is approximately 300 times more efficient than a naive implementation in terms of GPU memory usage, and can scale to hundreds of thousands of algorithms running in Alpaca’s system.
AlpacaAlgo’s new user interface also enhances the algorithm design workflow with a versioning feature. This allows one to revert to a previous version of an algorithm, review its performance, and tune the configurations more interactively. The recently and substantially improved Portfolio feature calculates an algorithm’s trading performance in the real-time live market using a virtual money account, which helps users validate the algorithm’s performance before investing real money. The company is working with online securities brokers to launch real-money trading capability in AlpacaAlgo later this year.

The press release can be viewed here: AlpacaAlgo update press release.

Jibun Bank Partners with Alpaca to Build the World’s First AI Supported Foreign Currency Deposit Service

Jibun Bank customers to take advantage of Alpaca’s AI and Database technologies

Alpaca, the San Mateo-based fintech startup behind “Capitalico,” a trading application that uses AI and deep learning, announced a partnership with Jibun Bank Corporation, a leading mobile-first bank in Japan. The partnership provides Jibun Bank’s customers access to Alpaca’s deep learning engine, which runs numerous models over massive amounts of historical and real-time market data, in a foreign currency deposit service*.


Alpaca brings industry-proven AI and Big Data technology to the financial trading space by building an AI engine that efficiently understands each trader’s specific trade strategies based on time series price data and various technical indicators. Alpaca also utilizes a unique high-speed Big Data analytics platform for financial data, named “MarketStore,” which is based on the founders’ previous experiences working with Wall Street institutions**.


“We are proud to partner with Japan’s leading mobile-first bank, Jibun Bank,” said Yoshi Yokokawa, Alpaca’s co-founder and CEO. “We see a huge opportunity distributing Alpaca’s AI engine to a broader audience with different types of business partners.”


Alpaca’s AI and Big Data technology are being uniquely applied in the financial market through its SaaS “Capitalico.”


About Alpaca

Alpaca is a San Mateo-based fintech startup that uses AI and deep learning to create new trading technology. It was founded by industry veterans in databases, AI, and financial trading. In March of 2016, Alpaca released a trade automation SaaS, “Capitalico,” to give everyone the opportunity to automate their own trade ideas using AI and deep learning technology. For more information, visit http://www.alpaca.ai.


About Jibun Bank

Jibun Bank Corporation is a joint-venture direct bank between a leading Japanese telecom company, KDDI Corporation, and a leading Japanese commercial bank, Bank of Tokyo-Mitsubishi UFJ. Since its launch in June 2008, Jibun Bank has striven to become a financial institution with top customer satisfaction – a “personal bank for each individual customer” – providing high-quality financial services that are both convenient and secure. Jibun Bank’s popularity has grown rapidly, and by March 2016 the bank had acquired more than 2,120,000 customers.

*: A foreign currency deposit service provides customers the option to make periodic deposits from other designated and linked currency accounts.

**: For a detailed explanation of Alpaca’s MarketStore, please see http://blog.alpaca.ai/next-level-of-time-series-data-storage-and-delivery-marketstore/

How I Enjoyed Fintech Internship at Alpaca, with Deep Learning for Trading Research

By Peter J. Zhang, an intern at Alpaca

Prediction of Stock Prices

I clicked into Alpaca’s website by chance late last year while looking for an internship. I am a Ph.D. student in theoretical physics in the U.S. I have a background in math and physics and some programming experience, though no formal training. I was seeking a research-oriented job for my first internship, in state-of-the-art areas such as deep learning, computer vision, and AI. So after a couple of rejections from big names, I started looking at startups. Everything looked normal until I noticed the strange puzzle on the recruiting page. I am exactly the kind of person who would never ignore such an interesting thing – it took me a couple of hours to solve the puzzle, and the moment I got the answer I knew I had to meet whoever created it.

I had a lot of hesitation before actually arriving in Tokyo. I don’t speak Japanese, and I had never been to the country, let alone worked there. I didn’t know how to rent an apartment in Japan, and I had no idea how to behave in an appropriately polite manner that wouldn’t cause trouble. Luckily the company solved most of these problems and arranged a comfortable apartment for me.

The internship started with an enthusiastic yet relaxing atmosphere, like the spring weather in Tokyo. After the first week of getting familiar with all the tools, I was asked to reproduce the results from a Natural Language Processing research paper (1) on market risk prediction using public news. As a Ph.D. student, this was a familiar kind of start for me. With help from colleagues and the computational power of the server, I was able to finish the task in just a few days. For the first time, I realized that an ordinary person is not that far from “the frontier” – where quants on Wall Street come up with complex strategies to profit in the market. And this is exactly what the product Capitalico is trying to tell people: bring the technology to everyone.

I have seen a lot of naive startups during college. Alpaca is definitely not like any of them. It has strong technical personnel, with experienced developers and professional researchers from graduate schools; it has a clear commercial plan and deep connections with investors. It blends Japanese style with Silicon Valley company culture. To quote our esteemed legal colleague Ian, “Everyone knows what they are doing.”

I asked Yuki, the chief engineer, if I could be more involved in website development. For the next two weeks, I worked behind the web pages, from front-end JavaScript browser notifications to the back-end algorithms of AlpacaScan. Each task took a couple of days of research and digging into documentation, but was totally doable for an intern like me. While contributing to the code base, I was also able to learn from the code style and infrastructure written by others.

Thanks to the shared code from Paul, Tomoaki, and Jun-ya, learning deep learning went more smoothly than I expected. I remember how amazed I was when I first saw the paper on “art style transfer” (the technique behind popular phone apps that apply artistic filters). Now I am not only able to run one on my own computer but can also extend the idea to other domains, from generating natural language to classifying different stock symbols.

During the last week of the internship, I was able to carry out research on my own. It started with an ambitious idea: can we predict the stock market, given enough information? I explored the question using neural networks and got interesting results.(2)

Every morning on the way to the office, there are always thousands of suited-up people walking in and out of Tokyo Station. We are probably the only people among them in casual clothes. There are no fixed working hours or location, and no strict office manners. Tomo-san always brought coffee and snacks, and took us on bus tours or to the bar after work. It feels more like a project group in college. I guess that is a major draw for people like me.

The internship granted me a great deal of valuable experience, as well as an extraordinarily long trip around Japan. I couldn’t have spent these 10 weeks better anywhere else.


  1. Kogan, Shimon, et al. “Predicting risk from financial reports with regression.” Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 2009.
  2. https://docs.google.com/presentation/d/1I-zaDYTDPp18DD3EX2YstkYprlc4PwFIQ6BNrQftgjE/pub?start=false&loop=false&delayms=3000

AlpacaScan: How Candlestick Patterns Reveal Trading Opportunities

Following the previous post where I explained the basics of the candlestick charts and shapes, I am going to walk through how you can discover trading opportunities by using candlestick patterns, specifically with AlpacaScan.


1. What is AlpacaScan?

AlpacaScan is a service we released last month. It picks up specific candlestick patterns and notifies you when a particular stock matches a pattern. You can sign up for the daily email or simply check our Twitter @capitalico every day at 5pm PT / 8pm ET.

AlpacaScan screens through stocks and identifies which specific stocks match its predefined patterns.
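As a concrete illustration of the kind of rule being matched, here is a minimal sketch of a Bullish Engulfing check. The conditions are a common textbook formulation, not necessarily the exact definition AlpacaScan uses:

```python
# Sketch of a candlestick pattern rule: Bullish Engulfing is a down candle
# followed by an up candle whose body engulfs the first body.
# The OHLC tuples below are illustrative, not Alpaca's code or data.

def is_bullish_engulfing(prev, curr):
    """prev/curr are (open, high, low, close) tuples for consecutive days."""
    po, ph, pl, pc = prev
    co, ch, cl, cc = curr
    return (pc < po        # first candle is bearish
            and cc > co    # second candle is bullish
            and co <= pc   # second body opens at or below the prior close
            and cc >= po)  # and closes at or above the prior open

print(is_bullish_engulfing((10.5, 10.6, 9.9, 10.0),
                           (9.9, 11.1, 9.8, 11.0)))  # True
```

A scanner then simply evaluates rules like this over the latest two bars of every symbol, every day.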



Testing results

Next we explore how useful candlestick patterns are. The chart below shows the results from backtesting Bullish Engulfing using AlpacaScan in the foreign exchange market. With 10 to 13 occurrences of Bullish Engulfing each year, the percentage likelihood of exiting with a profit after 1, 2, or 4 weeks is indicated below.


Here we see that in 2013, buying during a Bullish Engulfing and exiting within 4 weeks yielded a high probability of a profitable exit. In 2015, however, the probability of exiting with a profit was lower. When it comes to making trades, there is no single pattern that can indicate market trends with absolute reliability. When interpreting candlestick chart patterns, it is important to analyze them in the proper context, including other indicators, trends, and market prices.


2. How to use AlpacaScan

As a self-directed trader, you are the one who decides ‘what’ to trade and ‘when’ to trade.  AlpacaScan has two popular uses to help you make those decisions.

One option is to select stocks, identified by AlpacaScan from over 7,000 U.S. stocks, that follow a specific candlestick pattern. While it would be impossible for an individual to check every price change every day, this task is fully within the capabilities of modern computing. Applying cutting-edge AI and deep learning technology to proven market analytics gives traders the unique benefit of staying grounded in the fundamentals while executing strategies at unprecedented volume and complexity.

A second option is to use AlpacaScan to fine-tune your medium- and long-term investments. For medium- and long-term stocks, you can develop highly specific strategies based on the historical pattern trends of your stocks. A backtest feature for this app is currently in development, so keep an eye out for its release soon!



Here is a real life example of how one of our members used AlpacaScan to make actual trades.

At first, he focused on Bullish Harami, a pattern which signifies a potential reversal of the downward trend. On May 17th, he analyzed the stocks following a Bullish Harami pattern, all of which were identified on AlpacaScan and communicated by email and Twitter.


He chose RRC, because the bearish first candle (in red) was not too long compared to the bullish second candle (in green) which had no visible upper shadow; this means that the highest price that day was the same as the closing price. Based on this analysis, he felt confident that the price would continue to rise.

AlpacaScan identifies and filters stocks that trigger specific logical conditions, therefore it is the stock’s performance that determines whether AlpacaScan will include it.

Even if a stock pattern meets the logical conditions, some patterns are stronger than others. In the case of SDRL, although it met the qualifying conditions of a Bullish Harami, the lengths of the bearish first candle and the bullish second candle were almost the same, and therefore less ideal than RRC. Similarly, CJES was less ideal because the bearish first candle was too short compared to the bullish second candle. And the bullish second candle of CNX closed at a high point, which can indicate that the stock might be sold off.
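The comparisons he made above come down to body-length ratios, which are easy to compute. This is an illustrative sketch with made-up prices, not his actual workflow or AlpacaScan output:

```python
# Sketch of the body-length comparison used above to rank Bullish Harami
# candidates. The (open, high, low, close) prices are invented for
# illustration; the "ideal" ratio band is the trader's own heuristic.

def body(candle):
    o, h, l, c = candle
    return abs(c - o)

def harami_body_ratio(first, second):
    """second-body / first-body; the discussion above prefers a second
    candle clearly smaller than the first, but not vanishingly small."""
    b1 = body(first)
    return body(second) / b1 if b1 else float("inf")

# hypothetical first (bearish) and second (bullish) candles per symbol
candidates = {
    "RRC":  ((40.0, 40.2, 38.9, 39.0), (39.2, 39.8, 39.1, 39.8)),
    "SDRL": ((3.6, 3.7, 3.3, 3.4), (3.4, 3.6, 3.4, 3.6)),
}
for sym, (first, second) in candidates.items():
    print(sym, round(harami_body_ratio(first, second), 2))
```

In this invented example the SDRL ratio comes out near 1.0 (bodies almost the same), echoing why it was judged less ideal.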

AlpacaScan also arranges stocks in order of volume. In this case, 69 stocks matched the Bullish Harami pattern, so being able to sort by volume helped him narrow his focus to the four highest-volume stocks.

To confirm, and as a matter of good practice, he also checked long-term price changes from finviz.


This chart shows the price changes of RRC since August 2015. With a simple glance, we can see that this stock has steadily risen since the beginning of 2016.

The pattern aligned with the medium- and long-term trend, and a reversal of the downward trend appeared, confirming that this was a good time to purchase the stock.

The chart below shows the subsequent price changes.


As he expected, the prices rose.

The purpose of AlpacaScan is to scan over 7,000 U.S. stocks and surface only the ones that meet the criteria for possible trend reversals. It is now possible to evaluate every U.S. stock, every day, and focus only on the patterns we like. But always remember: before making any trade decision, due diligence and additional research must be performed. It won’t be long before AI can replicate this kind of trade decision analysis and make stock trading decisions like the one in this example.


3. Summary

Candlestick patterns are one of the most famous forms of chart analysis. They are useful for deciding when to trade stocks, and there is a great deal of information on how to make the most of them.

But always remember that candlestick chart patterns are a source of information, not a recommendation. The proper way to use this information is to support a trade decision alongside other indicators and sources of information. As one of the most valuable of those sources, candlestick chart patterns are well worth the time to learn.


Candlestick Patterns: The Basics of Stock Price and FX Charts

The first time I looked at a stock price chart, I was confused by the shapes and design of what is called “candlestick charts.”  These kinds of charts are easily identified by their distinctive empty or filled vertical sticks.


These charts can actually be quite complicated. One new popular trading app, Robinhood for example, shows price changes as lines in order to avoid candlestick chart complexity. However, the complexity of the candlestick is what allows this style of charting to reveal so much information. This information is so valuable and interesting that candlestick patterns have been developed and analyzed since the 17th century in Japan.

Even as we specialize in advanced emerging technology such as AI and deep learning, we also make the most out of proven analytics such as candlestick patterns.  Last month, we released AlpacaScan, a daily notification that screens chart patterns based on popular candlestick configurations. Next we explain the basics of candlestick patterns.


What is a ‘candlestick’?

A candlestick displays the price changes for a unit period, such as 1 minute, 5 minutes, 15 minutes, 1 hour, 4 hours, 1 day, or 1 week.[1] It is called a ‘candlestick’ because its shape resembles one.


A candlestick pattern shows the high, low, open and close values. The thick portion of the candlestick that is either hollow or filled is called “the real body” while the long thin lines that resemble candle wicks above and below the body are called “shadows.”
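These parts can be derived directly from the four values. A tiny illustrative helper (not from any Alpaca library):

```python
# Derive the real body and shadows of a candlestick from its OHLC values.
# The prices used below are invented for illustration.

def candle_parts(o, h, l, c):
    top, bottom = max(o, c), min(o, c)
    return {
        "bullish": c > o,           # hollow (up) candle
        "body": top - bottom,       # the real body
        "upper_shadow": h - top,    # wick above the body
        "lower_shadow": bottom - l  # wick below the body
    }

print(candle_parts(10.0, 11.0, 9.0, 10.5))
# a bullish candle: body 0.5, upper shadow 0.5, lower shadow 1.0
```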


Basic candlesticks

There are nine basic candlestick shapes.


Although the shapes appear simple, they reveal a great deal of information about price changes quickly and visually. And while each individual candlestick displays multiple data points, much more valuable information can be extracted from a set of candlesticks.
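To make this concrete, here is a rough classifier for a few common basic shapes. The names and thresholds are illustrative conventions; the nine basic shapes above may be named or bounded differently:

```python
# Rough classifier for a few common basic candlestick shapes.
# Thresholds (5% body for a doji, 95% for a marubozu) are illustrative.

def classify(o, h, l, c, doji_frac=0.05):
    rng = h - l
    if rng == 0:
        return "four-price doji"   # open == high == low == close
    body = abs(c - o)
    if body <= doji_frac * rng:
        return "doji"              # negligible body
    if body >= 0.95 * rng:         # body fills almost the whole range
        return "bullish marubozu" if c > o else "bearish marubozu"
    return "bullish candle" if c > o else "bearish candle"

print(classify(10.0, 10.5, 9.5, 10.02))  # doji
print(classify(10.0, 11.0, 10.0, 11.0))  # bullish marubozu
```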


The history of candlestick patterns according to Wikipedia

Some of the earliest technical trading analysis was used to track prices of rice in the 17th century. Much of the credit for candlestick charting goes to Munehisa Homma (1724–1803), a rice merchant from Sakata, Japan who traded in the Ojima Rice market in Osaka during the Tokugawa Shogunate.[2]

In 1755, he wrote San-en Kinsen Hiroku (三猿金泉秘録, The Fountain of Gold – The Three Monkey Record of Money), the first book on market psychology. In it, he claims that the psychological aspect of the market is critical to trading success and that traders’ emotions have a significant influence on rice prices. He notes that recognizing this can enable one to take a position against the market: “when all are bearish, there is cause for prices to rise” (and vice versa).

He describes the rotation of Yang (a bull market), and Yin (a bear market) and claims that within each type of market is an instance of the other type. He appears to have used weather and market volume as well as price in adopting trading positions.[3]


Academic proof

Japanese candlesticks differ from the typical North American bar chart style in the way they are visually represented. Although both identify the same points of open, close, high, and low, the stylistic properties of the candlestick put a visual emphasis on the relationship between the opening and closing prices within the same day and within a larger pattern. Many chartists and technical analysts use these visual cues to better understand market sentiment.

Candlestick pattern analysis, however, is very complex, and the usefulness of a pattern depends on certain external factors. For example, because candlestick chart patterns have become so popular over the last several years, many financial firms have invested a great deal in deconstructing known patterns and developing algorithms to take advantage of them. The automation of trading means that market movement and sentiment can be captured much more quickly, altering the meaning of a candlestick in ways not previously conceived.

Despite the accelerated evolution in candlestick analysis, reliable patterns that stay true to the fundamentals continue to appear, allowing for short and long term profit opportunities.

Thomas Bulkowski takes an in-depth look at 103 candlestick formations, from identification guidelines and statistical analysis of their behavior to detailed trading tactics, in his bestseller, “Encyclopedia of Candlestick Charts”.[4]


In the next post, I will introduce AlpacaScan and walk you through how members at Alpaca use it to actually make trades.


[1] “Candlestick Definition” (http://www.investopedia.com/terms/c/candlestick.asp)
“Introduction to Candlesticks” (http://stockcharts.com/school/doku.php?id=chart_school:chart_analysis:introduction_to_candlesticks)
[2] “Wikipedia: Candlestick pattern” (https://en.wikipedia.org/wiki/Candlestick_pattern)
“Wikipedia: Homma Munehisa” (https://en.wikipedia.org/wiki/Homma_Munehisa)


Alpaca presented at Plug and Play Fintech EXPO


After three months in the incubation program, we got a great chance to talk about our progress with Capitalico, a platform to execute your trades with AI. Thanks to all for coming to today’s pitch, and we hope you enjoyed it!


Aside from our pitch, we feel fortunate to be in the program with other startups including:

  • Skymind, which provides fraud detection with its deep learning framework
  • SkuChain, which offers blockchain technology for supply chains
  • DoubleNetPlay, which helps employees save and manage money
  • SnapCheck, which is trying to replace the outdated paper-check system in the US
  • DataSimply, whose AI analyzes SEC filings for asset managers and analysts
  • TitanFile, a file sharing service built with enterprise-level security in mind
  • NoPassword, which pushes MFA and other password-free solutions
  • Token, which aims to rebuild the payment system the way the World Wide Web did for information
  • and all the other Fintech companies

And kudos to Token, who won the award today. Token’s founder, Steve Kirsch, is one of the most famous serial entrepreneurs in the Valley; he has founded 7 companies, including Infoseek and Token, two of which had billion-dollar exits. We are very happy and proud to be part of this program and to talk with all of these Fintech companies that will change the financial industry!

Next level of time series data storage and delivery: MarketStore


(Photo by Jay Bergesen)

Alpaca’s products deal with capital market data, the majority of which is time series.  As our business expands from a niche market to a broader audience, we have come to realize some of the data challenges we face.  Some are common and public, others are unique to us, but here are the high-level data demands of capital market applications.


  • Maximal throughput and lowest latency with limited resources
  • Scale with growing number of users and algos, as well as asset classes
  • Reliable delivery of market data to applications


Maximal throughput and lowest latency with limited resources

Speed is key.  If chart loading is slow, people stop using the application.  We need to serve thousands of people who dynamically scroll intraday charts back 10 years to take a close look at what would have happened in the market and in their algos.  This contributes a lot to the user experience as well as to reliable live-test operation.  Time series data is notoriously hard to serve with high performance from a general-purpose data storage system, as it requires many specific operations such as sorting and resampling.  True, we could spend a lot of money on huge hardware to achieve the best performance, but that’s not what we want at Alpaca.  We squeeze the best capability out of minimal resources.  Aside from computing resources, another important resource is our team members’ effort.  Beyond managing the running software, we should be able to iterate quickly on new applications built with the market data.
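To give a flavor of the resampling operation mentioned above, here is a plain-Python sketch that aggregates 1-minute OHLC bars into 5-minute bars. The input format is illustrative, not our storage layout:

```python
# Resampling sketch: aggregate fine-grained OHLC bars into coarser bars.
# bars: list of (open, high, low, close) tuples in time order.

def resample_ohlc(bars, factor=5):
    out = []
    # process only complete groups of `factor` bars
    for i in range(0, len(bars) - len(bars) % factor, factor):
        chunk = bars[i:i + factor]
        out.append((
            chunk[0][0],               # open of the first bar
            max(b[1] for b in chunk),  # highest high
            min(b[2] for b in chunk),  # lowest low
            chunk[-1][3],              # close of the last bar
        ))
    return out

minute_bars = [(i, i + 1, i - 1, i + 0.5) for i in range(10)]
print(resample_ohlc(minute_bars))  # two 5-minute bars
```

Doing this on demand over thousands of symbols and timeframes is exactly the kind of workload a general-purpose store struggles with.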


Scale with growing number of users and algos, as well as asset classes

We started from a niche market but are seeing more demand for bigger use cases every day.  The user base increases day by day, and we envision eventually serving millions of people. Even if we are talking about only tens of thousands of users, with each typically building a handful of algos, the number of algos we host on our platform reaches hundreds of thousands.  What hedge fund or investment bank in the world has ever run as many trading algos simultaneously, all the time, as we do? Remember, each algo is like a person reading the chart every minute, so data needs to be served without any issues to that many algos.  And while the number of algos keeps increasing, we are also expanding our data coverage from a handful of currency pairs to more than ten thousand stock symbols, futures, bitcoins, and ETFs across the world.  We don’t want this data scalability issue to stop our business from growing.


Reliable delivery of market data to applications

Data is like sashimi.  It’s best when served as fresh as possible.  As described above, we host so many algos, all with their mouths open to catch the fresh data coming out of the market.  The right data should be delivered to the right consumer at the earliest possible time; otherwise algos cannot generate meaningful results.  Since our neural network-based algos are computationally heavy, we distribute tasks across many computers and shuffle them to make the best use of computing resources.  What’s more, it is important to keep the data close to the computation.  At the same time, we have to watch the market constantly and get the latest information for every one of these ever-growing asset classes.  Finally, this is a system that deals with real money.  If we fail to deliver correct data to the right calculation, our business will lose confidence.


How and why are we approaching it now?

Last year, to start our closed beta program, we wrote a few hundred lines of Python that did the job for the data size at the time, but we knew it wouldn’t scale to what we wanted it to be.  That’s not bad: as a small startup, we always start small and see how big it needs to become.  But we realized the time had come to revisit this problem.


So, a couple of months ago we sat together and had a long discussion about how we would tackle this capital market data problem.  Chris, who recently joined the team, brings experience from mission-critical defense systems and IBM DB2 development.  Luke built high-performance computing clusters back in the ’90s for numerical simulation and later founded a company and database called Greenplum, which became one of the most successful Big Data databases.  I built many data-driven applications before joining Greenplum, back when it wasn’t yet called Big Data or Data Science, and later I architected the Greenplum database engine.  During that time, we worked with many customers in the financial sector and learned a lot about the kinds of problems capital market applications face.  Out of those experiences, we came up with the system now called MarketStore.  It is designed specifically to solve the problems described earlier, by incorporating modern technologies.


Design points

It is still too early to finalize the design, and we continue to improve it as we better understand the data demands on our platform, but the key design points are as follows.

  • 100s of thousands of time series to serve and update in real time (10k+ symbols * 5-6 timeframes)
  • more than 10 years of second-level data
  • 100s of thousands of concurrent clients (algos) for both historical queries and a real-time pub/sub model
  • up to once-per-second update frequency for each of these time series
  • sub-millisecond latency
  • minimal hardware resources
  • high availability, with minimal manual intervention

Basically, these requirements say it is a mix of fast data and big data, which are typically handled by separate products.  Since it is hard to satisfy such different requirements at the same time, a product usually focuses on one, not both.  From my database development experience, I can see why these points are challenging.  The reason MarketStore can do both jobs is that it is optimized for our financial application purposes.


It is written mainly in Go to overcome the concurrency and memory problems we had with Python’s interpreter.  It is designed with modern hardware/software such as SSDs and sparse file support in mind, while persisting data in global object storage.  Client queries are processed in a manner similar to typical databases, going through a parser/planner/executor model.  The client interface uses HTTP (currently 1.1, but could expand to 2.0), including WebSocket.
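To illustrate the parser/planner/executor flow in miniature, here is a toy Python version over an in-memory store. The query syntax, function names, and storage layout are all invented for illustration; the real engine is written in Go and works nothing like this internally:

```python
# Toy parser/planner/executor pipeline over an in-memory store.
# Everything here is invented for illustration, not MarketStore's API.

STORE = {  # (symbol, timeframe) -> closing prices, oldest first
    ("AAPL", "1D"): [100.0, 101.5, 99.8, 102.2],
}

def parse(query):
    # "SYMBOL/TIMEFRAME LIMIT n" -> structured request
    key, _, n = query.split()
    symbol, timeframe = key.split("/")
    return {"symbol": symbol, "timeframe": timeframe, "limit": int(n)}

def plan(req):
    # planning: resolve the storage key and the slice to read
    return (req["symbol"], req["timeframe"]), -req["limit"]

def execute(plan_):
    key, start = plan_
    return STORE[key][start:]

print(execute(plan(parse("AAPL/1D LIMIT 2"))))  # last two closes
```

The separation of stages is the point: the same planner can serve both historical queries and the real-time subscription path.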


Now we have MarketStore to expand our business to US equities and a broader range of managed data.  While running our products, we are also listening to our users and are keen on building things that solve our customers’ real problems.  Thanks to this data store, our application development has become much easier, and we are able to iterate in short cycles, forming and verifying hypotheses.  It may sound easy to build applications around market data, but it would be much, much harder without MarketStore to fetch a single symbol’s data for one timeframe out of tens of thousands of symbol/timeframe pairs that update very frequently.
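To illustrate that access pattern — many readers each pulling one frequently-updated symbol/timeframe series — here is a minimal in-memory sketch (not MarketStore code) using a read-write lock so concurrent lookups do not block one another:

```go
package main

import (
	"fmt"
	"sync"
)

// Store maps a "SYMBOL/TIMEFRAME" key to its latest bars.
type Store struct {
	mu     sync.RWMutex
	series map[string][]float64
}

func NewStore() *Store {
	return &Store{series: make(map[string][]float64)}
}

// Append adds a new bar under the exclusive writer lock.
func (s *Store) Append(key string, price float64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.series[key] = append(s.series[key], price)
}

// Get returns a copy of one series; many Gets can run concurrently
// because they only take the shared reader lock.
func (s *Store) Get(key string) []float64 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]float64, len(s.series[key]))
	copy(out, s.series[key])
	return out
}

func main() {
	st := NewStore()
	st.Append("AAPL/1Min", 150.0)
	st.Append("AAPL/1Min", 150.5)
	st.Append("MSFT/1Min", 300.0)
	fmt.Println(st.Get("AAPL/1Min")) // [150 150.5]
}
```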



You may ask why we are building new data storage when there are so many existing data products out there.  Let’s look at them one by one.


Relational Database

Relational databases such as PostgreSQL and MySQL are viable solutions, and I like them.  That said, it is hard to optimize them for time series applications, since relational algebra has no notion of order within a dataset.  And we didn’t need SQL.



Hadoop Ecosystem

HDFS is a good foundation for any kind of modern data platform.  It is especially good if you want to throw gigantic amounts of data at it and forget about it, but we frequently have to retrieve the majority of the data, and for that purpose we have S3 anyway.  MapReduce and Hive latencies are nowhere near our requirements, and Spark would be good if we wanted to run k-means or something similar over large data, but we don’t.  HBase is oriented more toward writes than our read-heavy workload, and none of these solutions provide time series capability with low-latency delivery.  I could list many more arguments in this space, but most importantly, there are just too many products, and managing a Hadoop cluster at our team size was simply not an option.


Message Bus

Yes, there are good solutions for message delivery, such as Kafka and RabbitMQ.  They are built for exactly that purpose, are highly available, and are easy to manage.  We could have used one, but it is also not hard to build our own.  In this sector especially, newly delivered data must be persisted and stay consistent with subsequent historical queries.  A message bus is not designed to keep data, so we needed to build the storage part anyway, and thanks to Go’s concurrency support in the language and runtime, we simply didn’t need a separate messaging system.
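The persist-then-publish requirement — every delivered update must also land in storage so later historical queries agree with the live stream — can be sketched with plain Go channels. This is an illustrative toy, not the actual implementation:

```go
package main

import "fmt"

// Update is one new data point for a time series.
type Update struct {
	Key   string
	Price float64
}

// Hub persists each update first, then fans it out to subscribers,
// keeping history consistent with the live stream.
type Hub struct {
	history map[string][]float64
	subs    []chan Update
}

func NewHub() *Hub { return &Hub{history: make(map[string][]float64)} }

// Subscribe registers a buffered channel to receive future updates.
func (h *Hub) Subscribe() <-chan Update {
	ch := make(chan Update, 16)
	h.subs = append(h.subs, ch)
	return ch
}

// Publish writes to storage before notifying anyone.
func (h *Hub) Publish(u Update) {
	h.history[u.Key] = append(h.history[u.Key], u.Price) // persist first
	for _, ch := range h.subs {
		ch <- u // then deliver
	}
}

func main() {
	h := NewHub()
	sub := h.Subscribe()
	h.Publish(Update{Key: "AAPL/1Min", Price: 150.0})
	got := <-sub
	fmt.Println(got.Key, len(h.history["AAPL/1Min"])) // AAPL/1Min 1
}
```

Because the write to history happens before the channel send, a subscriber who later queries history is guaranteed to see at least everything it has been delivered.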



Pandas DataFrame

DataFrame is great, and in fact, we started with it.  It was originally developed by quants, so it makes sense that it has a full set of time series functionality for financial data applications.  It’s just that Pandas was never meant for a server-side backend system.  We still use DataFrames in parts of our system, and MarketStore clients can build local DataFrame objects from server responses, but replacing all DataFrame persistence with MarketStore is one of our immediate goals.



Open-Source Time Series Database

You may have heard of this relatively new open source database, also written in Go.  It is a great effort to address time series database problems in the community with an open approach, and I like that.  Still, it seemed to me that it was designed primarily for system metrics data (and possibly IoT-style applications), which is not the best fit for our financial data.  When it comes to technical analysis in financial applications, there are many windowing operations, such as moving averages and other derived indicators.  That kind of query extensibility was not there when I looked (I guess it is getting there these days), and it also lacks seamless integration between historical queries and real-time message delivery.
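As an example of the kind of windowing operation we want close to the query layer, here is a simple moving average in Go. It is a generic sketch of the technique, not a MarketStore query feature:

```go
package main

import "fmt"

// movingAverage returns the n-period simple moving average of prices.
// The result has len(prices)-n+1 entries; nil if there is too little data.
func movingAverage(prices []float64, n int) []float64 {
	if n <= 0 || len(prices) < n {
		return nil
	}
	out := make([]float64, 0, len(prices)-n+1)
	var sum float64
	for i, p := range prices {
		sum += p
		if i >= n {
			sum -= prices[i-n] // slide the window forward
		}
		if i >= n-1 {
			out = append(out, sum/float64(n))
		}
	}
	return out
}

func main() {
	closes := []float64{10, 11, 12, 13, 14}
	fmt.Println(movingAverage(closes, 3)) // [11 12 13]
}
```

Running this server-side over stored bars avoids shipping an entire raw series to the client just to compute one derived indicator.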



Commercial Financial Database

If you have ever worked in the financial industry as a DBA, you probably know this unique database, which is actually quite popular in that sector.  It is a columnar, in-memory database designed specifically for financial applications.  It is, however, a commercial product that runs as a single-node database and doesn’t scale, so you would need expensive hardware with sizeable RAM.


So, although there really are many different data products out there, I didn’t think any of them could solve our problems, and I am guessing that is the case for many other people in this sector.  That is one of the reasons we are putting our effort into this storage engine.



I have summarized the current state of our financial data storage, MarketStore.  I am sure this is an interesting topic for many engineers, and I definitely look forward to hearing any feedback or questions.  Although we invest resources in this component because we need it, it is not our main business, so we may consider open-sourcing it once it is ready.  For now, we will incubate it on our internal servers, so stay tuned…


Hitoshi, CTO of Alpaca

Yoshi presented at TiECON as a TIE50 Finalist!


Yoshi presented Capitalico and its backend technology at TiECON as one of the TIE50 Top Startups last Friday, May 6th, in Santa Clara. TiECON has been running for more than 20 years in Silicon Valley, and it is our honour to be selected as one of the Top Startups.