Add MaxPool1D layer #96
 * @property [dataFormat] Data format of input; can be either of [CHANNELS_LAST], or [CHANNELS_FIRST].
 */
public class MaxPool1D(
    public val poolSize: Int = 2,
Please use 3D long arrays here until we find the final solution in our discussion.
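For context, here is a minimal sketch of how scalar `poolSize`/`strides` parameters could be expanded into the 3D long arrays a low-level 1D pooling op would expect for channels-last input. This is an illustrative helper in plain Kotlin, not code from this PR, and the function name is hypothetical:

```kotlin
// Hypothetical helper: expand scalar pool/stride parameters into
// full-sized 3D long arrays over (batch, steps, channels).
// Batch and channel dimensions are never pooled, hence size/stride 1.
fun expandPoolArgs(poolSize: Int, strides: Int?): Pair<LongArray, LongArray> {
    val stride = (strides ?: poolSize).toLong() // default stride = pool size
    val pool = longArrayOf(1, poolSize.toLong(), 1)
    val strideArr = longArrayOf(1, stride, 1)
    return pool to strideArr
}
```

With this shape convention, only the middle (temporal) entry varies, which is why a single `Int` is enough at the layer's public API level.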
    public val poolSize: Int = 2,
    public val strides: Int? = null,
    public val padding: ConvPadding = ConvPadding.VALID,
    public val dataFormat: String = CHANNELS_LAST,
Remove the data format.
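As a side note on the `padding` parameter in the constructor above, the output length of a 1D pooling op is conventionally derived as follows (standard TensorFlow-style formulas, shown here for illustration; this is not code from the PR):

```kotlin
import kotlin.math.ceil

// Standard TF-convention output-length formulas for 1D pooling.
// VALID: only full windows that fit inside the input are used.
// SAME: the input is implicitly padded so every position is covered.
fun pooledLength(inputLen: Int, poolSize: Int, stride: Int, padding: String): Int =
    when (padding) {
        "VALID" -> ceil((inputLen - poolSize + 1) / stride.toDouble()).toInt()
        "SAME" -> ceil(inputLen / stride.toDouble()).toInt()
        else -> error("Unknown padding: $padding")
    }
```

For example, an input of length 10 with `poolSize = 2` and `stride = 2` yields length 5 under both conventions, but length 7 with `poolSize = 3` and `stride = 2` yields 3 under VALID and 4 under SAME.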
 * NOTE: we can use `MaxPool.Options` argument of `tf.nn.maxPool` to pass
 * the data format, as follows:
 * ```
 * val tfDataFormat = if (dataFormat == CHANNELS_LAST) "NHWC" else "NCHW_VECT_C"
important note
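For checking the layer against the underlying TF op, a plain-Kotlin reference implementation of 1D max pooling over a single channel can serve as a ground truth. This is an illustrative helper under VALID-padding semantics, not part of the PR:

```kotlin
// Reference 1D max pooling (VALID padding) over a single sequence.
// Illustrative ground-truth helper; not code from this PR.
fun maxPool1d(input: FloatArray, poolSize: Int, stride: Int): FloatArray {
    require(poolSize > 0 && stride > 0)
    if (input.size < poolSize) return FloatArray(0)
    val outLen = (input.size - poolSize) / stride + 1
    return FloatArray(outLen) { i ->
        var m = input[i * stride]
        for (j in 1 until poolSize) m = maxOf(m, input[i * stride + j])
        m
    }
}
```

A unit test could compare the layer's output on a fixed tensor against this function, element by element.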
@zaleslaw I just merged it with the master branch and updated the implementation.
@zaleslaw I changed the strides type and default value per your request.
LGTM, could be merged; I've run TC.
@zaleslaw By the way, I was wondering why running the tests on the CI server takes so long to complete. Is it due to the CI server, or are there a lot of tests? Admittedly, those tests which involve fitting a model on sample data might take longer, but I guess there is probably room for improvement to optimize them.
@mkaze On my laptop it takes around 1.5 hours (16 GB RAM, modern CPU), but it takes 2~3 hours on CI. I should say that I haven't tried to reduce test/example time; it's a good checker with good coverage, which helps to find changes that break something else.

Another problem is writing good asserts for the trained models: for example, we want to reach an accuracy of 0.7~0.8 on a task, but if we run all trainings for only 1 epoch, it gives us unstable results (due to TF non-determinism; unfortunately, we have no full control on the JVM side), and if I run tests for 3+ epochs, it multiplies the example running time by 3.

I suppose that some of the longest examples could be excluded from the Gradle task examples:test, or should be parametrized with the number of epochs/batch size and behave differently when one code snippet runs as an example versus as a test. Many new tests were added during the last two weeks, and I'll revisit the current tests and examples and describe the goal of testing for this project.

So, could I ask you: what exactly bothers you about the tests taking a long time to run? In any case, you have many valuable suggestions, including this one, which make the project more mature, and you share your perspective, which I really appreciate.
@zaleslaw Thanks a lot for the explanation. Yeah, I agree that it's not unusual for tests to take hours to complete, especially in projects concerned with data processing. I was just thinking about whether we can further improve the tests so that the development cycle could be shortened, i.e. we get feedback on whether the changes are correct in a shorter period. I assume that the majority of test time is consumed by those tests which involve fitting a model on a dataset and validating that the model has actually learned something (i.e. that it reaches a reasonable train/validation accuracy). So, just as an example of a potential improvement, we could consider reducing the dataset size (e.g. using 10% of MNIST instead of all of it) or even using a small synthetic dataset, maybe not for all the tests but at least for a good fraction of them. Of course, this should be given further thought to evaluate its pros and cons. All in all, it sounds fine to me. As I said, I was just interested in whether this feedback loop could be further optimized and improved (just for the sake of efficiency!). I haven't yet taken a look at all the unit tests, so I'll definitely review them, and if anything useful comes to mind, I'll certainly share it with you. Thanks again!
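The parametrization idea above (different epoch counts/dataset fractions for example runs vs. CI runs) could be sketched like this in plain Kotlin. The property names and helper functions are purely hypothetical, not from this project's build setup:

```kotlin
// Hypothetical sketch: parametrize test effort via JVM system properties
// so the same snippet can run as a quick CI test or a full example.
// Property names ("test.epochs", "test.datasetFraction") are illustrative.
data class TestConfig(val epochs: Int, val datasetFraction: Double)

fun testConfigFromProperties(): TestConfig = TestConfig(
    epochs = System.getProperty("test.epochs")?.toIntOrNull() ?: 3,
    datasetFraction = System.getProperty("test.datasetFraction")?.toDoubleOrNull() ?: 1.0,
)

// Take the first fraction of sample indices for a reduced run,
// always keeping at least one sample.
fun reducedIndices(total: Int, fraction: Double): IntArray =
    IntArray((total * fraction).toInt().coerceAtLeast(1)) { it }
```

CI could then pass e.g. `-Dtest.epochs=1 -Dtest.datasetFraction=0.1` while the examples task uses the defaults.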
Resolves #60. This PR adds support for the MaxPool1D layer.
@zaleslaw: Unlike the MaxPool2D or Conv2D implementation, I haven't used a full-sized array (i.e. including batch and features) to specify pooling and stride size. Please let me know if this is not desired.