@yqzhishen yqzhishen commented Feb 23, 2024

Removed features and behaviors

  • Discrete F0 embedding type (temporarily reserved in ONNX exporter)
  • Code backup on training start
  • Random seeding during training
  • Linear domain of random time stretching augmentation
  • Migration script and guidance for transcriptions and checkpoints from version 1.X

Other changes

  • interp_uv configuration is removed and forced to True.
  • train_set_name and valid_set_name are removed and forced to train and valid.
  • num_pad_tokens is removed and forced to 1.
  • ffn_padding is removed and forced to SAME.
  • g2p_dictionary configuration is removed in favor of dictionary.
  • pndm_speedup configuration is renamed to diff_speedup.
  • The duplicate txt_embed layer in FastSpeech2 encoder is removed.
  • Some configuration keys are now accessed directly instead of through a fallback default, so the configuration must contain all of them: dictionary, diff_accelerator, f0_embed_type, pe, pndm_speedup, use_key_shift_embed, use_speed_embed.
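
For anyone carrying an older configuration forward, the key changes listed above can be sketched as a small dictionary transform. This is only an illustration and not code from this pull request: the names `FORCED_VALUES`, `RENAMED_KEYS`, `migrate_config`, and `missing_keys` are hypothetical, and the sketch assumes the configuration has already been loaded into a flat Python dict. `REQUIRED_KEYS` is copied as written in the list above, including `pndm_speedup`.

```python
from typing import Any

# Keys removed by this change, mapped to the value that is now forced.
FORCED_VALUES: dict[str, Any] = {
    "interp_uv": True,
    "train_set_name": "train",
    "valid_set_name": "valid",
    "num_pad_tokens": 1,
    "ffn_padding": "SAME",
}

# Old key -> new key. g2p_dictionary is treated as a rename here (assumption):
# its value is carried over only if `dictionary` is not already set.
RENAMED_KEYS: dict[str, str] = {
    "pndm_speedup": "diff_speedup",
    "g2p_dictionary": "dictionary",
}

# Copied verbatim from the description above.
REQUIRED_KEYS: tuple[str, ...] = (
    "dictionary", "diff_accelerator", "f0_embed_type", "pe",
    "pndm_speedup", "use_key_shift_embed", "use_speed_embed",
)


def migrate_config(old: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Return a migrated copy of `old` plus human-readable warnings."""
    new = dict(old)  # shallow copy; the input dict is left untouched
    warnings: list[str] = []

    # Drop removed keys, warning when the old value differs from the forced one.
    for key, forced in FORCED_VALUES.items():
        if key in new:
            value = new.pop(key)
            if value != forced:
                warnings.append(
                    f"{key}={value!r} is no longer configurable; "
                    f"it is now forced to {forced!r}"
                )

    # Move values from old key names to new ones without overwriting
    # a value that was already given under the new name.
    for old_key, new_key in RENAMED_KEYS.items():
        if old_key in new:
            new.setdefault(new_key, new.pop(old_key))

    return new, warnings


def missing_keys(config: dict[str, Any], required=REQUIRED_KEYS) -> list[str]:
    """List the required keys that are absent from `config`."""
    return [key for key in required if key not in config]


if __name__ == "__main__":
    # Hypothetical old-style configuration, for demonstration only.
    legacy = {"interp_uv": False, "pndm_speedup": 10, "g2p_dictionary": "my_dict.txt"}
    migrated, notes = migrate_config(legacy)
    print(migrated)
    for note in notes:
        print("warning:", note)
```

Returning warnings instead of raising keeps the sketch usable as a dry run: a value that differs from the forced one (for example `interp_uv: false`) is reported rather than silently dropped.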

@yqzhishen yqzhishen marked this pull request as ready for review February 23, 2024 17:45
@yqzhishen yqzhishen marked this pull request as draft February 23, 2024 17:45
@yqzhishen yqzhishen marked this pull request as ready for review February 25, 2024 07:48
@yqzhishen yqzhishen merged commit c16095b into openvpi:main Feb 25, 2024
agentasteriski added a commit to agentasteriski/DiffSinger_colab_notebook_MLo7 that referenced this pull request Feb 27, 2024
No longer supported as of openvpi/DiffSinger#172
Raised config f0_max to 1600 to match existing binarizer edit
use_melody_encoder is now enabled along with predict_pitch