Add MaxPool1D layer #96
 * @property [dataFormat] Data format of input; can be either of [CHANNELS_LAST], or [CHANNELS_FIRST].
 */
public class MaxPool1D(
    public val poolSize: Int = 2,
Please use 3D long arrays here until we find the final solution in our discussion.
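For context, here is a minimal sketch of how scalar `poolSize`/`strides` parameters could be expanded into the 3D long arrays a low-level 1D pooling op would expect for channels-last input. This is an illustrative helper in plain Kotlin, not code from this PR, and the function name is hypothetical:

```kotlin
// Hypothetical helper: expand scalar pool/stride parameters into
// full-sized 3D long arrays over (batch, steps, channels).
// Batch and channel dimensions are never pooled, hence size/stride 1.
fun expandPoolArgs(poolSize: Int, strides: Int?): Pair<LongArray, LongArray> {
    val stride = (strides ?: poolSize).toLong() // default stride = pool size
    val pool = longArrayOf(1, poolSize.toLong(), 1)
    val strideArr = longArrayOf(1, stride, 1)
    return pool to strideArr
}
```

With this shape convention, only the middle (temporal) entry varies, which is why a single `Int` is enough at the layer's public API level.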
    public val poolSize: Int = 2,
    public val strides: Int? = null,
    public val padding: ConvPadding = ConvPadding.VALID,
    public val dataFormat: String = CHANNELS_LAST,
Remove the data format.
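As a side note on the `padding` parameter in the constructor above, the output length of a 1D pooling op is conventionally derived as follows (standard TensorFlow-style formulas, shown here for illustration; this is not code from the PR):

```kotlin
import kotlin.math.ceil

// Standard TF-convention output-length formulas for 1D pooling.
// VALID: only full windows that fit inside the input are used.
// SAME: the input is implicitly padded so every position is covered.
fun pooledLength(inputLen: Int, poolSize: Int, stride: Int, padding: String): Int =
    when (padding) {
        "VALID" -> ceil((inputLen - poolSize + 1) / stride.toDouble()).toInt()
        "SAME" -> ceil(inputLen / stride.toDouble()).toInt()
        else -> error("Unknown padding: $padding")
    }
```

For example, an input of length 10 with `poolSize = 2` and `stride = 2` yields length 5 under both conventions, but length 7 with `poolSize = 3` and `stride = 2` yields 3 under VALID and 4 under SAME.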
 * NOTE: we can use `MaxPool.Options` argument of `tf.nn.maxPool` to pass
 * the data format, as follows:
 * ```
 * val tfDataFormat = if (dataFormat == CHANNELS_LAST) "NHWC" else "NCHW_VECT_C"
important note
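For checking the layer against the underlying TF op, a plain-Kotlin reference implementation of 1D max pooling over a single channel can serve as a ground truth. This is an illustrative helper under VALID-padding semantics, not part of the PR:

```kotlin
// Reference 1D max pooling (VALID padding) over a single sequence.
// Illustrative ground-truth helper; not code from this PR.
fun maxPool1d(input: FloatArray, poolSize: Int, stride: Int): FloatArray {
    require(poolSize > 0 && stride > 0)
    if (input.size < poolSize) return FloatArray(0)
    val outLen = (input.size - poolSize) / stride + 1
    return FloatArray(outLen) { i ->
        var m = input[i * stride]
        for (j in 1 until poolSize) m = maxOf(m, input[i * stride + j])
        m
    }
}
```

A unit test could compare the layer's output on a fixed tensor against this function, element by element.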
@zaleslaw I just merged it with the master branch and updated the implementation.
@zaleslaw I changed the strides type and default value per your request.
LGTM, could be merged; I've run TC.
@zaleslaw By the way, I was wondering why running the tests on the CI server takes so long to complete. Is it due to the CI server, or are there a lot of tests? Admittedly, those tests which involve fitting a model on sample data might take longer, but I guess there is probably room for improvement to optimize them.
@mkaze On my laptop it takes around 1.5 hours (16 GB RAM, modern CPU), but it takes 2~3 hours on CI. I should say that I haven't tried to reduce test/example time; it's a good checker with good coverage, which helps to find changes that break something else.

Another problem is writing good asserts for the trained models: for example, we want to reach an accuracy of 0.7~0.8 on a task, but if we run all trainings for only 1 epoch, it gives us unstable results (due to TF non-determinism; unfortunately, we have no full control on the JVM side), and if I run tests for 3+ epochs, it multiplies the example running time by 3.

I suppose that some of the longest examples could be excluded from the Gradle task examples:test, or should be parametrized with the number of epochs/batch size and behave differently when one code snippet runs as an example versus as a test. Many new tests were added during the last two weeks, and I'll revisit the current tests and examples and describe the goal of testing for this project.

So, could I ask you: what exactly bothers you about the tests taking a long time to run? In any case, you have many valuable suggestions, including this one, which make the project more mature, and you share your perspective, which I really appreciate.
@zaleslaw Thanks a lot for the explanation. Yeah, I agree that it's not unusual for tests to take hours to complete, especially in projects concerned with data processing. I was just thinking about whether we can further improve the tests so that the development cycle could be shortened, i.e. we get feedback on whether the changes are correct in a shorter period. I assume that the majority of test time is consumed by those tests which involve fitting a model on a dataset and validating that the model has actually learned something (i.e. that it reaches a reasonable train/validation accuracy). So, just as an example of a potential improvement, we could consider reducing the dataset size (e.g. using 10% of MNIST instead of all of it) or even using a small synthetic dataset, maybe not for all the tests but at least for a good fraction of them. Of course, this should be given further thought to evaluate its pros and cons. All in all, it sounds fine to me. As I said, I was just interested in whether this feedback loop could be further optimized and improved (just for the sake of efficiency!). I haven't yet taken a look at all the unit tests, so I'll definitely review them, and if anything useful comes to mind, I'll certainly share it with you. Thanks again!
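The parametrization idea above (different epoch counts/dataset fractions for example runs vs. CI runs) could be sketched like this in plain Kotlin. The property names and helper functions are purely hypothetical, not from this project's build setup:

```kotlin
// Hypothetical sketch: parametrize test effort via JVM system properties
// so the same snippet can run as a quick CI test or a full example.
// Property names ("test.epochs", "test.datasetFraction") are illustrative.
data class TestConfig(val epochs: Int, val datasetFraction: Double)

fun testConfigFromProperties(): TestConfig = TestConfig(
    epochs = System.getProperty("test.epochs")?.toIntOrNull() ?: 3,
    datasetFraction = System.getProperty("test.datasetFraction")?.toDoubleOrNull() ?: 1.0,
)

// Take the first fraction of sample indices for a reduced run,
// always keeping at least one sample.
fun reducedIndices(total: Int, fraction: Double): IntArray =
    IntArray((total * fraction).toInt().coerceAtLeast(1)) { it }
```

CI could then pass e.g. `-Dtest.epochs=1 -Dtest.datasetFraction=0.1` while the examples task uses the defaults.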
Resolves #60. This PR adds support for the MaxPool1D layer.
@zaleslaw: Unlike the MaxPool2D or Conv2D implementation, I haven't used a full-sized array (i.e. including batch and features) to specify pooling and stride size. Please let me know if this is not desired.