Use Spark 2.x to recognize handwritten digits (with initial focus on parsing scores from a golf scorecard)

josephpconley/spark-digit-recognizer

spark-digit-recognizer

Use Spark 2.x to recognize handwritten digits on a standard left-to-right golf scorecard

Blockers

MNIST encodes pixel values [0-255] from white to black, and its digits stand in sharp contrast to the background.
My images use the opposite scale, [0-255] from black to white, and the background does not contrast sharply with the digits.
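One possible way around this blocker is to flip the pixel scale and then stretch the contrast so the scanned image looks more like MNIST input. The sketch below is plain Python, not code from this repo; the function names `invert` and `stretch_contrast` and the sample pixel values are made up for illustration, and it assumes a simple linear min-max stretch is good enough.

```python
def invert(pixels):
    """Flip a 0-255 greyscale image so the black-to-white scale becomes white-to-black."""
    return [[255 - p for p in row] for row in pixels]


def stretch_contrast(pixels):
    """Linearly rescale so the lowest value maps to 0 and the highest to 255."""
    flat = [p for row in pixels for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A completely flat image has no contrast to stretch.
        return [[0 for _ in row] for row in pixels]
    scale = 255.0 / (hi - lo)
    return [[int(round((p - lo) * scale)) for p in row] for row in pixels]


# Hypothetical 2x3 scan: light-grey paper (around 200) with darker ink (around 90).
scan = [[200, 90, 198], [201, 95, 200]]
mnist_like = stretch_contrast(invert(scan))
```

After both steps the ink pixels sit near 255 and the paper near 0, which matches the polarity described for MNIST above. A real scorecard scan may need thresholding or denoising on top of this.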

Initial Steps

  • build a simple decision tree model which will predict handwritten digits from an image (67% accuracy)
  • take a sample image and convert to greyscale
  • crop one score from your sample and use its size (120 x 120) as the window dimensions you'll test for across the whole image
  • test model against sample score
  • our 67%-accurate model thought our 5 was an 8; maybe we should get a better model first?
  • [] use neural net model to improve accuracy
  • figure out how to scale your sample's score to 28x28 so the model can interpret it correctly
  • [] iterate over a few rows of your image and see if we can use the probabilities from the model to do OCR
  • [] try to use probabilities to parse out rows of numbers
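The scaling and row-iteration steps above could be sketched roughly as follows. This is plain Python rather than Spark code, and it is only one way to do it: `downscale` uses simple box averaging to shrink a 120 x 120 crop to 28 x 28, `to_feature_vector` flattens the result into 784 values, and `window_origins` lists the top-left corners of non-overlapping 120 x 120 windows. All three names, the step size, and the sample crop are assumptions made for illustration.

```python
def downscale(pixels, out_size=28):
    """Shrink a greyscale image to out_size x out_size by averaging source blocks."""
    src_h, src_w = len(pixels), len(pixels[0])
    out = []
    for i in range(out_size):
        r0 = i * src_h // out_size
        r1 = max(r0 + 1, (i + 1) * src_h // out_size)
        row = []
        for j in range(out_size):
            c0 = j * src_w // out_size
            c1 = max(c0 + 1, (j + 1) * src_w // out_size)
            block = [pixels[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def to_feature_vector(small):
    """Flatten a 2D image row by row into a single list of pixel values."""
    return [v for row in small for v in row]


def window_origins(width, height, size=120, step=120):
    """Yield (left, top) corners of size x size windows, scanning left to right, row by row."""
    for top in range(0, height - size + 1, step):
        for left in range(0, width - size + 1, step):
            yield (left, top)


# Hypothetical 120x120 crop: a bright square on a dark background.
crop = [[255 if 40 <= r < 80 and 40 <= c < 80 else 0 for c in range(120)]
        for r in range(120)]
features = to_feature_vector(downscale(crop))
origins = list(window_origins(360, 240))
```

Each window could be cropped, downscaled, and passed to the model, and the predicted probabilities used to decide whether the window contains a digit at all. A smaller `step` would give overlapping windows at the cost of more predictions.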

Next

Maybe

  • [] try to handle top-to-bottom oriented scorecards

Questions

  • Why does the decision tree model keep predicting 8's?
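One way to start investigating this question is to tally how often each digit is predicted and which true labels end up as 8. The sketch below is plain Python; the function names are hypothetical, and the labels and predictions are made-up placeholders, not results from this project.

```python
from collections import Counter


def prediction_histogram(predictions):
    """Count how many times each digit was predicted."""
    return Counter(predictions)


def confusion_counts(labels, predictions):
    """Count (true label, predicted label) pairs."""
    return Counter(zip(labels, predictions))


# Placeholder data for illustration only.
labels = [5, 3, 8, 5, 0, 3]
predictions = [8, 8, 8, 5, 0, 8]

histogram = prediction_histogram(predictions)
confusion = confusion_counts(labels, predictions)

# True labels that were mistaken for an 8, with counts.
mistaken_for_8 = {true: n for (true, pred), n in confusion.items()
                  if pred == 8 and true != 8}
```

If nearly every input maps to 8 regardless of its true label, the input preprocessing is worth checking before the model itself.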

TODO

  • [] find an easy way to convert the model's probabilities into an array of doubles for writing to CSV
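A minimal sketch of this TODO in plain Python, assuming each prediction comes with a sequence of ten class probabilities. In Spark ML the `probability` column holds vectors, and calling `toArray` on a vector should give the plain array needed here; the helper names and the sample row below are hypothetical.

```python
import csv
import io


def probability_row(label, probabilities):
    """Build one CSV row: the predicted label followed by plain floats."""
    return [label] + [float(p) for p in probabilities]


def write_probabilities(rows, out):
    """Write a header and one line per (label, probabilities) pair."""
    writer = csv.writer(out)
    writer.writerow(["prediction"] + ["p%d" % d for d in range(10)])
    for label, probs in rows:
        writer.writerow(probability_row(label, probs))


# Made-up example row; a real run would take these from the model output.
sample = [(8, [0.05, 0.0, 0.1, 0.0, 0.0, 0.25, 0.0, 0.0, 0.5, 0.1])]
buffer = io.StringIO()
write_probabilities(sample, buffer)
csv_text = buffer.getvalue()
```

Writing to a `StringIO` keeps the example self-contained; passing an open file handle instead would write the same rows to disk.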

Sources
