Skip to content

Conversation

mkaze
Copy link
Contributor

@mkaze mkaze commented Jun 2, 2021

Resolves #60. This PR adds support for MaxPool1D layer.

  • Implement layer class
  • Add unit tests for layer implementation
  • Add support for importing models containing this layer from JSON config files
  • Add support for exporting models containing this layer to JSON config files

@zaleslaw:

  • According to the point I mentioned here, unlike MaxPool2D or Conv2D implementation, I haven't used a full-sized array (i.e. including batch and features) to specify pooling and stride size. Please let me know if this is not desired.
  • Clearly the tests could be refactored to factor out their common parts; however, I thought it would be better to do it in another PR which hopefully would implement a mechanism for feeding data to pooling (or even all) layers in a TF session and test their output.

@zaleslaw zaleslaw added the Review This PR is under review label Jun 9, 2021
* @property [dataFormat] Data format of input; can be either of [CHANNELS_LAST], or [CHANNELS_FIRST].
*/
public class MaxPool1D(
public val poolSize: Int = 2,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use 3d long arrays here before we find the final solution in our discussion

public val poolSize: Int = 2,
public val strides: Int? = null,
public val padding: ConvPadding = ConvPadding.VALID,
public val dataFormat: String = CHANNELS_LAST,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the data format

* NOTE: we can use `MaxPool.Options` argument of `tf.nn.maxPool` to pass
* the data format, as follows:
* ```
* val tfDataFormat = if (dataFormat == CHANNELS_LAST) "NHWC" else "NCHW_VECT_C"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

important note

@mkaze
Copy link
Contributor Author

mkaze commented Jun 10, 2021

@zaleslaw I just merged it with master branch and updated the implementation.

@mkaze mkaze requested a review from zaleslaw June 10, 2021 14:53
@mkaze
Copy link
Contributor Author

mkaze commented Jun 11, 2021

@zaleslaw I changed the strides type and default value per your request.

@mkaze mkaze requested a review from zaleslaw June 11, 2021 13:52
@zaleslaw
Copy link
Collaborator

LGTM, could be merged, I've run TC

@mkaze
Copy link
Contributor Author

mkaze commented Jun 11, 2021

@zaleslaw By the ways, I was wondering why running the tests on CI server take so long to complete. Is it due to the CI server, or are there a lot of tests? Although, those tests which involve fitting a model on a sample data might take longer, but I guess probably there is room for improvement to optimize the them.

@zaleslaw
Copy link
Collaborator

@mkaze on my laptop takes around 1,5 hours (16 GB RAM, modern CPU), but it takes 2~3 hours on CI.
Of course, it could be reduced, especially when the functionality will be covered with the unit tests better when now.

I could say that I don't try to reduce test/examples time, and it's a good checker with good coverage, which helps to find something which breaks something else.

Another problem is the good asserts for the trained models: for example, we want to reach accuracy 0.7 ~0.8 on the task, but if we will run all trainings for the 1 epoch, it gives us unstable results (due to TF non-determinism, we have no full control on JVM side, unfortunately), if I run tests for 3+ epochs it multiplies running examples time on 3.

I suppose that some of the longest examples could be excluded from the Gradle task examples:test or should be parametrized with number of epochs/batch size) and be different for one code snippet when it runs as an example or as a test.

There are many new tests added during the last two weeks, and I'll revisit the current tests and examples and describe the goal of testing for this project.

So, could I ask you, what exactly bothers you that tests take a long time to run?
It seems to me that we are working quite quickly and efficiently. It seems to me that on many open source projects related to data processing and machine learning, tests take a long time, sometimes up to ten hours, i.e., I wanted to say that we have nothing unusual here, and even having optimized the tests, we are unlikely to be able to reduce their time by order of magnitude, ten or more times.

In any case, you have many valuable suggestions, including this, which makes the project more mature, and share your perspective, which I really appreciate.

@zaleslaw zaleslaw merged commit 09ca865 into Kotlin:master Jun 11, 2021
@mkaze
Copy link
Contributor Author

mkaze commented Jun 11, 2021

@zaleslaw Thanks a lot for the explanation. Yeah, I agree that it's not unusual to have tests take hours to complete especially in projects concerned with data processing. I was just thinking whether we can further improve the tests so that the development cycle duration could be reduced, i.e. we can get a feedback whether the changes are correct in a shorter period.

I assume that the majority of test time is consumed by those tests which involve fitting a model on a dataset and validating whether the model has actually learned something (i.e. it reaches a reasonable train/validation accuracy). So, just as an example of a potential improvement, we can consider the possibility of reducing dataset size (e.g. using 10% of MNIST instead of all of it) or even using a small synthetic dataset, maybe not for all the tests but at least for a good fraction of them. Of course, this should be given further thought to evaluate its pros and cons.

All in all, it sounds fine to me. As I said, I was just interested if this feedback loop could be further optimized and improved (just for the sake of efficiency!). I haven't yet taken a look at all the unit tests; so I would definitely take a look at them and if anything useful came to my mind, I would certainly share it with you. Thanks again!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Review This PR is under review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Audio] Add MaxPooling1D layer
2 participants