Splunk Inc.

08/09/2022 | News release | Distributed by Public on 08/09/2022 08:58

Splunk Tools for Fun and Profit: A Hacker's Guide to Casino Blackjack

By David Holiday August 09, 2022

* DISCLAIMER: Gambling involves inherent financial risk. This post is intended for entertainment purposes only, as are the datasets provided herein. If you think you or someone you love has a problem with gambling, contact the National Council on Problem Gambling Help Line at 1-800-522-4700.

The same tools a hacker employs to overcome obstacles are used every day by Advantage Players (APs) to beat the game of casino Blackjack. What in hacker parlance is referred to as social engineering is the means by which APs endeavor to remain undetected by casino staff. In the same way a hacker will perform reconnaissance and learn everything there is to know about their target, an AP will know what games each casino offers and will fully understand the math of those games before making their first bet. Both employ skill, guile and determination but to different ends.

The hacker endeavors to overcome an obstacle in a computer system; the AP seeks to transmute a game designed to empty your wallet into one that fills it.

This post will focus on how Splunk tools can be used to transform Blackjack data into fun and profit.

Background

For those unfamiliar with Blackjack, a brief and woefully inadequate primer:

Blackjack is a casino game wherein the player and dealer are each given two cards. One of the dealer's cards is face down, the other is up. Each card has a value. Numbered cards are valued at their number. Face cards are valued at ten. Aces can be either one or eleven, at the player's discretion. The goal of the game is to compose a hand that is valued as close to twenty-one as possible without going over. If you go over, you lose. If the dealer gets closer to twenty-one than you do, you lose. For a more complete description, see this.
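To make the card values concrete, here is a minimal sketch of hand valuation in Python. It is not taken from the simulator used for this post; the function name `hand_value` and the rank strings are assumptions made for illustration.

```python
def hand_value(cards):
    """Return the best Blackjack total for a list of ranks such as ["A", "K"].

    Hypothetical helper for illustration; not part of the post's simulator.
    """
    total = 0
    aces = 0
    for rank in cards:
        if rank == "A":
            aces += 1
            total += 11           # count each ace as eleven at first
        elif rank in ("J", "Q", "K"):
            total += 10           # face cards are worth ten
        else:
            total += int(rank)    # numbered cards "2" through "10"
    # Downgrade aces from eleven to one while the hand would otherwise bust.
    while total > 21 and aces:
        total -= 10
        aces -= 1
    return total


print(hand_value(["A", "K"]))        # ace counted as eleven
print(hand_value(["A", "A", "9"]))   # one ace as eleven, one as one
print(hand_value(["10", "9", "5"]))  # over twenty-one: a bust
```

Treating every ace as eleven and downgrading only when needed mirrors the "at the player's discretion" rule: the player always wants the highest total that does not go over.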

Unlike many games in a casino, Blackjack is a game wherein the probability of what happens next is a function of what has already happened. There is a finite number of cards of each type in the shoe and, unless the casino is using a Continuous Shuffling Machine, the shoe is not reshuffled after each hand. This means that by paying close attention to what cards have already been played, an AP can make a better-than-chance prediction as to what cards are likely to be played in the future.

This is the basis of Card Counting, the technique we will use Splunk offerings to explore.
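A small sketch shows why the past matters. It tracks the composition of a six-deck shoe and computes the chance that the next card is ten-valued, before and after some low cards have been dealt. This is plain shoe arithmetic, not the Speed Count or RE-KO systems discussed later, and the helper names are assumptions for illustration.

```python
from collections import Counter

RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
TEN_VALUED = {"10", "J", "Q", "K"}


def new_shoe(decks=6):
    """Return a Counter of remaining cards per rank (four suits per deck)."""
    return Counter({rank: 4 * decks for rank in RANKS})


def ten_probability(shoe):
    """Probability that the next card dealt is worth ten."""
    remaining = sum(shoe.values())
    tens = sum(shoe[rank] for rank in TEN_VALUED)
    return tens / remaining


shoe = new_shoe()
before = ten_probability(shoe)      # 96 ten-valued cards out of 312

# Illustrative scenario: four each of 2, 3, 4, 5 and 6 have been dealt.
for rank in ["2", "3", "4", "5", "6"]:
    shoe[rank] -= 4
after = ten_probability(shoe)       # still 96 tens, but only 292 cards left

print(f"before: {before:.4f}  after: {after:.4f}")
```

Once low cards leave the shoe, the remaining cards are richer in tens, which is the kind of shift a counting system tries to track.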

A word on the legality of Card Counting before we dive in: It has long been established in the United States that Card Counting is a legitimate, legal, manner of play. It also has long been established in the United States that casinos, which are on private property, have every right to consider you a trespasser and ask you to leave at any time for almost any reason. A patron who can reliably take their money is one such reason. Unfortunately, there are numerous instances wherein the manner in which a patron was "asked" to leave was, shall we say, unpleasant. See "The Law for Gamblers: A Legal Guide to the Casino Environment" by gaming lawyer Robert Nersesian for details as to case law and the various means by which casinos have exercised their right to remove patrons from their premises.

In many jurisdictions it is also ILLEGAL to use mechanical devices to assist in card counting. So, while this post explores how Splunk logic can be applied to counting cards, you should be aware that using Splunk software, in any device, on the casino floor is likely illegal.

Setup

In order to use Splunk software to analyze Blackjack data we first need data. I generated that data using a Blackjack simulator I wrote for this blog post. You can find that simulator, the data we will use for the analysis and the Splunk dashboard code here.

The data we will be working with is in blackjack_simulation_files.zip (direct link). If you do not have Splunk entitlements and want to try things out with the provided data, you can use the Docker-Splunk project here.

While a treatise on the inner workings of the simulator code is out of scope, we should take a moment to describe what exactly was being simulated. Blackjack can be played with many rule variations. In this case we set up a game of Blackjack wherein:

  • Blackjack pays 3:2
  • The shoe has six decks
  • Player is permitted to double down on any first two cards
  • Player is permitted to re-split to up to four hands
  • Player is permitted to surrender

Additionally, the simulator employed seven player agents, each behaving differently. All but Player 7 employed perfect Basic Strategy. Player 7 was programmed to only buy into the hand and never take any cards. Players 2 to 6 were programmed to count cards. The players that counted cards did so according to different techniques and bet spreads - meaning they varied their bet depending on the degree to which their counting scheme indicated they had an advantage.

Here is a breakdown of all the player agent behavior:

PLAYER  PLAY STRATEGY   COUNT STRATEGY  BET SPREAD
ONE     BASIC STRATEGY  NONE            1:1
TWO     BASIC STRATEGY  SPEED COUNT     1:8
THREE   BASIC STRATEGY  SPEED COUNT     1:12
FOUR    BASIC STRATEGY  RE-KO           1:5
FIVE    BASIC STRATEGY  RE-KO           1:6
SIX     BASIC STRATEGY  RE-KO           1:7.5
SEVEN   ALWAYS STAND    NONE            1:1
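To illustrate what a bet spread means in practice, here is a minimal sketch of a bet ramp. The linear ramp, the `advantage_steps` input and the function name are all assumptions made for illustration; the post does not describe how the simulator's agents actually sized their bets. The only figures taken from the post are the $15 betting unit and the spreads in the table.

```python
def bet_size(advantage_steps, unit=15.0, spread=8.0):
    """Return a wager for a hypothetical linear bet ramp.

    advantage_steps: how far the count sits above the point where the player
        believes they have an edge (an assumed, system-specific number).
    unit: the minimum bet in dollars.
    spread: the maximum bet expressed as a multiple of the unit (1:8 -> 8.0).
    """
    # Never bet less than one unit, never more than the top of the spread.
    multiplier = min(max(1.0, float(advantage_steps)), spread)
    return unit * multiplier


print(bet_size(0))                # no perceived edge: the minimum bet
print(bet_size(3))                # modest edge: three units
print(bet_size(20))               # large edge: capped at the 1:8 spread
print(bet_size(10, spread=7.5))   # capped at a 1:7.5 spread
```

A 1:1 spread, as used by Players 1 and 7, makes the function return the minimum bet every time, which is what flat betting means.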

There are two runs of data we will explore. The first is 16,000 batches, each a 1,000-hand session of Blackjack. The second is 16,000 batches, each a 10,000-hand session of Blackjack. This means both data sets have 16k data points indicating the bankroll outcome for all seven players after the batch of hands is complete.

The batch sizes were picked to simulate ten hours of play, and one hundred hours of play, respectively. In other words, the outcome of a few days in Vegas vs the outcome of what a serious hobbyist could do in the span of a year.

In the zip file there are two JSON files which represent the data we'll be exploring. Note that, due to the size constraints of a blog post, I used a Jupyter notebook to transform the data into something I could ingest into Splunk software without setting up a forwarder from the simulator to Splunk software.
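The notebook itself is not reproduced in this post, so what follows is only a sketch of the general idea: writing one JSON object per batch so each batch can be uploaded as a single event. The nested layout is inferred from the field names used in the search later in this post (PLAYER_STATS, PLAYER_1, _START_BANKROLL, _END_BANKROLL). The INDEX field and the function name are hypothetical.

```python
import json


def to_event_lines(batches):
    """Serialize each batch record as one compact JSON object per line."""
    return "\n".join(json.dumps(batch, sort_keys=True) for batch in batches)


# Two made-up batch records; the real files hold 16,000 batches each.
sample = [
    {"INDEX": 0,  # hypothetical ordering field
     "PLAYER_STATS": {"PLAYER_1": {"_START_BANKROLL": 5000,
                                   "_END_BANKROLL": 4850}}},
    {"INDEX": 1,
     "PLAYER_STATS": {"PLAYER_1": {"_START_BANKROLL": 5000,
                                   "_END_BANKROLL": 5230}}},
]

lines = to_event_lines(sample).splitlines()
print(len(lines), "events")
print(lines[0])
```

Keeping each batch on its own line is a common way to make event boundaries unambiguous when uploading a file by hand.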

Let's Get Splunky!

The first thing we need to do is get data into Splunk tools. From the landing page click on "Add Data".

Now click "Upload". Select one of the two json files.

You can click "Next" until the file is uploaded.

From here click "Start Searching".

And you should see something that looks like this.

Success!

Before we proceed to the dashboard let's take a moment to examine one of the commands the dashboard uses to extract data. Append the following to the search command:

| foreach PLAYER_STATS.PLAYER_*._* [eval PLAYER_STATS.PLAYER_<<MATCHSEG1>>._BANKROLL_DELTA='PLAYER_STATS.PLAYER_<<MATCHSEG1>>._END_BANKROLL'-'PLAYER_STATS.PLAYER_<<MATCHSEG1>>._START_BANKROLL']
| stats stdev(*BANKROLL_DELTA) AS *BANKROLL_DELTA by host

There are two commands at work:

  • foreach allows us to iterate over every field whose name matches a wildcard pattern and, in conjunction with the eval command, create a new field, BANKROLL_DELTA, for each player. Our analysis will focus on how much money each player ends up with after each batch. Because that field isn't in the data set we imported into Splunk Enterprise, we leverage the foreach command to create it.
  • stats empowers us to calculate things like standard deviation and mean. In this case we are computing the standard deviation of BANKROLL_DELTA. The "by" clause groups the results by one or more fields. In this case, we're grouping by host value.
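For readers who think more readily in a general-purpose language, here is a rough Python analogue of what the two commands do: derive a bankroll delta per player for each event, then take the sample standard deviation of those deltas per player. The event layout is inferred from the field names in the search, the figures are made up, and the function names are hypothetical. Grouping by host is omitted because the sketch assumes a single host.

```python
from statistics import stdev


def bankroll_deltas(events):
    """Map each player to a list of (end - start) bankroll deltas, one per event."""
    deltas = {}
    for event in events:
        for player, stats in event["PLAYER_STATS"].items():
            delta = stats["_END_BANKROLL"] - stats["_START_BANKROLL"]
            deltas.setdefault(player, []).append(delta)
    return deltas


def delta_stdev(events):
    """Sample standard deviation of the bankroll delta for each player."""
    return {player: stdev(values)
            for player, values in bankroll_deltas(events).items()}


# Two made-up events standing in for two batches.
events = [
    {"PLAYER_STATS": {
        "PLAYER_1": {"_START_BANKROLL": 5000, "_END_BANKROLL": 5100},
        "PLAYER_7": {"_START_BANKROLL": 5000, "_END_BANKROLL": 0}}},
    {"PLAYER_STATS": {
        "PLAYER_1": {"_START_BANKROLL": 5000, "_END_BANKROLL": 4900},
        "PLAYER_7": {"_START_BANKROLL": 5000, "_END_BANKROLL": 200}}},
]

print(bankroll_deltas(events))
print(delta_stdev(events))
```

The inner loop over players plays the role of foreach, and the dictionary comprehension plays the role of stats.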

When all is said and done, you should see something like what's below when you execute the search command:

Putting It All Together in a Dashboard

Before proceeding any further you need to have both 1k-rounds_5k-bankroll_15_ordered_w_index.json and 10k-rounds_5k-bankroll_15_ordered_w_index.json imported into Splunk Enterprise. They are both found in the blackjack_simulation_files.zip file.

Included in the zip file is the source code for a dashboard I created that uses SPL commands like the one above to create visualizations of the data. Click on "Dashboards" on the menu bar then click the green "Create New Dashboard" button.

Call it whatever you like, but you need to select "Dashboard Studio" on the right in order for this to work. It doesn't matter if you select "Absolute" or "Grid" layout mode. Once you've done so click the green "create" button.

You should see a blank dashboard. Click the "source" icon on the menu bar. It is the one on the far right of the icon set and is depicted in the screenshot above with the label "source". Copy and paste the contents of blackjack_dashboard.json into the window. You want to replace the current dashboard JSON with the contents of the aforementioned file. When you're done, click the gray "Back" button on the right side of the window.

You should now have a fully populated dashboard like the one below.

What Does It All Mean?

What you're looking at is three comparisons using the data we imported into Splunk Enterprise.

The first pair is the bankroll delta - the amount of money gained or lost by each player - per round. If your top two graphs look like fuzz, you can zoom in on a subset of the data by left clicking on your mouse to draw a box around the slice of data you want to inspect more closely.

The second pair is the mean bankroll delta across all batches in the data set.

The third pair is the standard deviation of the bankroll deltas across all batches in the data set.

Immediately what stands out is Player7 had a bad time. In nearly every instance they lost their entire bankroll. One insight from this data is that standing on every hand in Blackjack is a sure-fire way to lose all your money.

Player1, the flat-betting Basic Strategy player, had by far the smallest mean bankroll gains. However, as with everyone but Player7, the standard deviation was quite high. In both sets of data for Player1, sigma was about a third of their bankroll. In plain English, and assuming the outcomes are roughly normally distributed, the data indicates there is a:

  • 68.3% chance Player1 will end up within roughly ±1/3 of their bankroll
  • 95.4% chance Player1 will end up within roughly ±2/3 of their bankroll
  • 99.7% chance Player1 will end up within ± all of their bankroll
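The percentages above are the standard coverage figures for one, two and three standard deviations of a normal distribution. A short sketch can reproduce them. It assumes the batch outcomes are approximately normal, which the post does not verify, and the function name is an assumption for illustration.

```python
import math


def coverage_within(k):
    """Probability that a normal variable lands within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within {k} sigma: {coverage_within(k):.1%}")
```

With sigma at about a third of the bankroll, three sigma spans roughly the whole bankroll, which is where the last bullet comes from.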

This for both ten hours of play (a trip to Vegas) and one hundred (more Blackjack than most of you will play in your entire lifetime).

This, you might say, is expected as Player1 is not card counting. Surely the card counting players are reliably earning more, right?

Kinda.

In every case, the data indicates the variance for the card counting players was substantially higher than for the Basic Strategy player (Player1). So while the mean bankroll delta was more positive for all the card counting players than for the two players who weren't counting, the standard deviation grew as well.

In plain English: the data indicates that under the conditions of the simulation the players that employed card counting often lost some or all of their bankroll. The highs for them were higher, but so too were the lows.

Conclusions

One thing I take from this data is that card counting may not be worthwhile if you only play Blackjack once in a while. The variance in the game is too high for there to be reliable gains given only a small number of sessions. What surprised me was the degree to which a bankroll of $5k, a relatively small betting unit of $15, and ten thousand rounds did not prove sufficient to demonstrate the reliability of the card counting techniques the simulation agents used to grow their bankroll.

Before anyone starts flipping tables, allow me to explain. The excellent work in Blackjack mathematics by people far more capable than your humble author, people like Don Schlesinger, Stanford Wong, Norm Wattenberger, et al., consistently shows the efficacy of card counting as a means of growing an AP's bankroll over time. In fact, simulation runs of 100k rounds (~1k hours of play) with a sufficiently large bankroll show counting does work in the long run.

At issue is:

  1. How many hands of Blackjack does one need to play before the statistics can be relied on to work in your favor?
  2. How large does your bankroll need to be with respect to your betting unit in order for you to be able to sustain the substantial variance inherent in the game?
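Both questions come down to how fast the expected gain grows compared with the spread of outcomes. Over N independent hands the expected gain grows in proportion to N, while the standard deviation grows only with the square root of N. The sketch below works through that arithmetic. The per-hand edge (0.02 units) and per-hand standard deviation (2.5 units) are illustrative assumptions, not values measured from the simulation data, and the function names are hypothetical.

```python
import math


def expected_and_sd(hands, edge_units, sd_units):
    """Expected gain and standard deviation, in betting units, after `hands` hands."""
    return hands * edge_units, math.sqrt(hands) * sd_units


def hands_for_ev_to_cover(k, edge_units, sd_units):
    """Number of hands at which the expected gain equals k standard deviations."""
    return (k * sd_units / edge_units) ** 2


EDGE = 0.02   # assumed average gain per hand, in betting units
SD = 2.5      # assumed standard deviation per hand, in betting units

ev, sd = expected_and_sd(10_000, EDGE, SD)
print(f"after 10,000 hands: expected gain {ev:.0f} units, sd {sd:.0f} units")
print(f"hands until the expected gain equals 1 sd: {hands_for_ev_to_cover(1, EDGE, SD):,.0f}")
print(f"hands until the expected gain equals 2 sd: {hands_for_ev_to_cover(2, EDGE, SD):,.0f}")
print(f"a $5,000 bankroll at $15 per unit is {5000 / 15:.0f} units")
```

Under these assumed numbers the standard deviation after 10,000 hands is still larger than the expected gain, and comparable in size to the whole bankroll. Different assumptions shift the figures, but the square-root relationship is why more hands and a deeper bankroll both help.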

What I found surprising - again given this data - is that a $15 betting unit (BU) with a $5,000 bankroll (BR) over 10,000 hands was not sufficient for the outcome to be consistently good for the players employing card counting.

This leads me to the second and perhaps most important conclusion. It's still gambling. Yes, it's gambling with an edge. Yes, the math is in your favor. However, and again according to this data, it's not a means by which you can treat the casino like an ATM. The number of hours you need to put into playing, coupled with the size of bankroll you need in order to make the math work reliably in your favor, is high to the point of being cost prohibitive to most people. And don't forget, before the odds break in your favor, the casino may legally ask you to leave.

I hope you've had as much fun reading this and playing with the dashboard as I did putting it all together. I'd like to thank Richard Munchkin for helping me get some background on Speed Count, my colleagues at Splunk (particularly Jeff Spencer for his help with the SPL, Ningwei Liu for his help with some of the stats, and Abhey Kohli for both his insights into Splunk data ingestion and for letting me do this crazy thing in the first place) and my loving and patient wife for listening to me rant about this for over a month.

Next time… standing up direct ingestion of data from the simulator to a Splunk offering, analysis of the value-add of count-based deviations, and more!