Implementation of Deep Belief Networks #3

Merged: 2 commits into dfdx:master on Dec 16, 2014

Conversation

jfsantos
Contributor

This is a cleaned-up version of my DBN implementation using the RBMs in Boltzmann.jl. For now it is a really simple extension, as it just adds a new type DBN and a function to fit it, as well as a helper function to compute the mean of the hiddens at a given layer. The user can only change the type of the first RBM because, in most of the applications I've seen, all the upper layers are Bernoulli RBMs, but this can easily be changed if needed.
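Roughly, the shape of that addition looks like the sketch below. The names and signatures are illustrative rather than the exact code in this PR, and it assumes Boltzmann.jl's RBM type plus a helper such as mean_hiddens(rbm, X) returning the hidden-unit means of a single RBM (the real helper may be named differently):

```julia
# Illustrative sketch of the DBN extension -- not necessarily the exact code in this PR.
type DBN
    layers::Vector{RBM}    # first layer may be Gaussian or Bernoulli; upper layers Bernoulli
end

# Greedy layer-wise training: fit each RBM on the hidden means of the layer below.
function fit(dbn::DBN, X::Matrix{Float64})
    input = X
    for rbm in dbn.layers
        fit(rbm, input)
        input = mean_hiddens(rbm, input)   # assumed helper: hidden-unit means for one RBM
    end
    dbn
end

# Helper: mean hidden activations at layer k for input X.
function means_at_layer(dbn::DBN, X::Matrix{Float64}, k::Int)
    input = X
    for i in 1:k
        input = mean_hiddens(dbn.layers[i], input)
    end
    input
end
```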

I also added an example that uses the MNIST dataset to train it. The HDF5 file is generated by this script from the Mocha.jl package. I think we could test it with a simpler dataset and add that test in a separate folder (e.g., examples/).

@@ -0,0 +1,9 @@
using HDF5, Boltzmann

f = h5open("/Users/jfsantos/.julia/v0.3/Mocha/examples/mnist/data/train.hdf5")
dfdx (Owner)

There's a separate package for this dataset - MNIST.jl - so there's no need to read the data manually. The only possible detail is that you may need to scale X to the range [0..1], since RBMs behave really badly with values larger than 1.
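In code, that suggestion might look roughly like this. It is only a sketch: it assumes traindata() from MNIST.jl returns the images as a features-by-examples matrix with raw 0-255 pixel values, and it uses Boltzmann.jl's BernoulliRBM as an example:

```julia
# Sketch only: load MNIST via MNIST.jl and rescale to [0, 1] before fitting.
# Assumes traindata() returns (features, labels) with pixel values in 0..255.
using MNIST, Boltzmann

X, labels = traindata()
X = X ./ maximum(X)                  # scale to [0, 1]; RBMs misbehave on larger values

rbm = BernoulliRBM(size(X, 1), 100)  # visible units = number of pixels, 100 hidden units
fit(rbm, X)
```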

jfsantos (Contributor, Author)

Sure, I can do that. The HDF5 file generated by the referenced script is already normalized to the correct range.

@dfdx
Owner

dfdx commented Dec 13, 2014

Thanks for contributing it! DBNs are a logical continuation of RBMs, but I never had time to implement them. A few details I was thinking of:

  1. We should be able to pass parameters to RBM constructors (and possibly to individual fit() calls on them). Probably the easiest way to achieve this would be to pass initialized layers, similar to Pipeline from scikit-learn (see the sketch after this list).
  2. Though this package is not intended to be superseded or replaced by Mocha.jl (e.g. I have successfully used a pure RBM on sparse data for a recommendation engine, which Mocha is really not designed for), some integration with it is very welcome. I especially like their replaceable backends, which simplify writing code for CPU and GPU a lot. On the other hand, as far as I know, they still lack belief networks, and we can fix that. Right now I'm busier with some classification algorithm packages, but taking a closer look at Mocha is definitely on my TODO list.
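For item 1, a hypothetical sketch of what passing pre-initialized layers could look like (the names here are illustrative, not actual Boltzmann.jl code):

```julia
# Hypothetical interface sketch, in the spirit of scikit-learn's Pipeline:
# the user constructs each RBM with its own parameters and hands the list to the DBN.
layers = [GRBM(784, 256),            # e.g. Gaussian visible units for real-valued input
          BernoulliRBM(256, 128),
          BernoulliRBM(128, 64)]
dbn = DBN(layers)
fit(dbn, X)                          # greedy layer-wise training, one RBM at a time
```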

@jfsantos
Contributor Author

Regarding 1, I think it is definitely important. We can write an improved constructor and fit functions to do this. The fit function for DBNs could take a list of arguments to be passed to each layer's fit call, for example.

I am trying to contribute to Mocha as well, and was thinking about adding replaceable backends to your RBM implementations. Basically, the compute-intensive functions from layers take a Backend instance as an argument and dispatch on the type of this argument (e.g., you have forward(b::CPUBackend, layer, X) and forward(b::GPUBackend, layer, X)). We could do pretty much the same thing for RBMs.
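A minimal sketch of that dispatch idea (the Backend types below are stand-ins, not Mocha's actual definitions, and the RBM fields W and hbias are assumptions):

```julia
# Sketch of dispatch-on-backend; the types here are illustrative stand-ins,
# not Mocha.jl's real Backend hierarchy.
abstract Backend
type CPUBackend <: Backend end
type GPUBackend <: Backend end

# The compute-heavy step is written once per backend; training code never branches.
forward(b::CPUBackend, rbm, X) = rbm.W * X .+ rbm.hbias   # plain BLAS path (assumes W is n_hid x n_vis)
forward(b::GPUBackend, rbm, X) = error("GPU kernel would go here")

# fit(rbm, X; backend=CPUBackend()) would then just call forward(backend, rbm, X).
```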

It would be interesting to add some integration with Mocha, even though their "philosophies" are a bit different (which makes sense, as training algorithms for RBMs and belief networks differ from those used for feed-forward nets). We could start by automating the process of performing unsupervised training of a DBN and then converting it to an MLP for supervised fine-tuning. This is exactly what I am doing now for my project, so I'll see if I can come up with a draft implementation.

dfdx added a commit that referenced this pull request on Dec 16, 2014
Implementation of Deep Belief Networks
dfdx merged commit 0e925aa into dfdx:master on Dec 16, 2014
@dfdx
Owner

dfdx commented Dec 16, 2014

@jfsantos If you don't mind, I changed the test to use the MNIST package instead of loading the file from the Mocha directory.

@jfsantos
Contributor Author

Sure, I think that is the way to go, as MNIST.jl already includes the data and does not require manually running a script as Mocha does.


jfsantos mentioned this pull request on Dec 20, 2014
@pluskid

pluskid commented Dec 21, 2014

Hi, I'm the author of Mocha. I agree that some integration of the two packages would be really nice for the community. For example, the immediate thing I can think of is using Boltzmann.jl to initialize weights for a DNN that then gets fine-tuned in Mocha.jl. This should be relatively straightforward if you export the trained weights to an HDF5 file and ask Mocha to load those weights as the initialization. Mocha already uses this kind of mechanism to load models trained by Caffe. The HDF5 file Mocha reads has a simple format; see here: http://mochajl.readthedocs.org/en/latest/user-guide/tools/import-caffe-model.html#mocha-s-hdf5-snapshot-format

Of course, we can discuss the data format if needed. :)
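On the Boltzmann.jl side, the export could be as simple as the sketch below with HDF5.jl. The dataset names and the RBM fields (W, hbias) are placeholders; the actual names and layout would have to follow Mocha's snapshot format linked above:

```julia
# Rough sketch of dumping trained DBN weights with HDF5.jl.
# Dataset names are placeholders -- the real ones must match Mocha's HDF5 snapshot format.
using HDF5

function save_for_mocha(path::String, dbn)
    h5open(path, "w") do f
        for (i, rbm) in enumerate(dbn.layers)
            write(f, "weights_$i", rbm.W)     # assumes the RBM stores its weights in W
            write(f, "bias_$i", rbm.hbias)    # and its hidden biases in hbias
        end
    end
end
```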

@dfdx
Owner

dfdx commented Dec 22, 2014

@pluskid I believe HDF5 will work fine. I'll have a long weekend starting Thursday to spend on (finally) learning Mocha and trying to implement this kind of export. Meanwhile, is there an example of converting Julia arrays to a Mocha-compatible 4D tensor?

@pluskid

pluskid commented Dec 23, 2014

@dfdx Starting with the latest version (v0.0.5), Mocha actually supports ND-tensors. An ND-tensor (Blob) is essentially a (shallow wrapper of a) Julia array Array{Float64, N} (or Float32). So if you have a Julia array and want to save it to an HDF5 file that Mocha can read, no conversion is needed, except that Mocha only supports Float32 or Float64, because BLAS only supports those.

For example, the weight blob of an InnerProduct layer is a 2D tensor (matrix) of shape P-by-Q, where P is the input dimension and Q is the target dimension. So essentially rand(Float64, (P, Q)) could be a valid initialization for the weight parameters.
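As a concrete, purely illustrative example of the shapes involved:

```julia
# Illustrative only: an InnerProduct-style weight matrix of shape (P, Q),
# kept as a plain Float64 Julia array -- exactly what a Mocha Blob wraps.
P, Q = 784, 100              # input dimension, target (output) dimension
W = rand(Float64, (P, Q))    # random but shape-valid weight initialization
size(W)                      # -> (784, 100); no conversion needed before saving to HDF5
```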

If you are interested, there is a bit of documentation about Blobs (ND-tensors) in Mocha: http://mochajl.readthedocs.org/en/latest/dev-guide/blob.html

@dfdx
Owner

dfdx commented Jan 3, 2015

I've added export to Mocha as part of the DBN redesign.
