External module "transcription/transcription_utils"

Utility functions for Onsets and Frames models.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Functions

batchInput

• batchInput(input: number[][], batchLength: number): Tensor<R3>

For batches in the middle (not the first or last), we pad the beginning and end with values from the previous and following batches to cover the receptive field.

We can't simply zero-pad the first and last batches: a bias is added to the padded values, so they are no longer zero after the first convolution. That does not match how "same" padding works, where the padding is reset to 0 at each layer. Instead, we treat the first and last batches differently. The first batch has no initial padding, and we append extra rows from the second batch to make its length match. The final batch has no end padding, and we prepend extra rows from the previous batch to make its length match.
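
The windowing described above can be sketched on plain row indices. This is only an illustration, not the library's implementation: `batchRowRanges` and `RowRange` are hypothetical names, `rfPad` stands in for RF_PAD (whose value is not given here), the number of batches is assumed to be known already, and the sketch assumes the input has at least batchLength + 2 * rfPad rows whenever there is more than one batch.

```typescript
// Half-open row range [start, end) that one batch reads from the input.
interface RowRange {
  start: number;
  end: number;
}

// Hypothetical helper: computes which input rows each batch covers.
// rfPad stands in for RF_PAD; numBatches is assumed to be known already.
function batchRowRanges(
    totalRows: number, batchLength: number, rfPad: number,
    numBatches: number): RowRange[] {
  const paddedLength = batchLength + 2 * rfPad;
  if (numBatches <= 1) {
    // Assumption: a single batch covers the whole input, with no padding.
    return [{start: 0, end: totalRows}];
  }
  if (totalRows < paddedLength) {
    throw new Error('Sketch assumes at least batchLength + 2 * rfPad rows.');
  }
  const ranges: RowRange[] = [];
  for (let i = 0; i < numBatches; i++) {
    if (i === 0) {
      // First batch: no leading padding, extra rows taken from the second batch.
      ranges.push({start: 0, end: paddedLength});
    } else if (i === numBatches - 1) {
      // Final batch: no trailing padding, extra rows taken from the previous batch.
      ranges.push({start: totalRows - paddedLength, end: totalRows});
    } else {
      // Middle batch: rfPad rows of real context on each side.
      const start = i * batchLength - rfPad;
      ranges.push({start, end: start + paddedLength});
    }
  }
  return ranges;
}
```

Every range has the same length, batchLength + 2 * rfPad, which is what allows the batches to be stacked into a single 3D tensor.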

In most cases, the number of batches will equal ceil(input.shape[0] / batchLength). However, in rare cases where the final batch would be shorter than the receptive field, it is instead appended to the previous batch, reducing the number of batches by 1.
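
The batch-count rule can be written out as a small calculation. This is a sketch under stated assumptions: `countBatches` is a hypothetical helper, `rfPad` stands in for RF_PAD, and the exact comparison used to decide that a final batch is "shorter than the receptive field" (here, a non-empty remainder of at most `rfPad` rows) is a guess at the intended rule rather than something the text above confirms.

```typescript
// Hypothetical helper: how many batches an input of totalRows rows yields.
function countBatches(
    totalRows: number, batchLength: number, rfPad: number): number {
  let numBatches = Math.ceil(totalRows / batchLength);
  const remainder = totalRows % batchLength;
  // Assumed merge rule: a short, non-empty final batch is folded into
  // the previous one, so there is one batch fewer.
  if (numBatches > 1 && remainder > 0 && remainder <= rfPad) {
    numBatches -= 1;
  }
  return numBatches;
}
```

For example, with a batchLength of 32 and an assumed rfPad of 3, an input of 100 rows leaves a remainder of 4 and keeps all 4 batches, while an input of 98 rows leaves a remainder of 2 and is merged down to 3 batches.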

Parameters

• input: number[][]

The 2D input matrix, shaped [N, D].

• batchLength: number

The desired batch length (excluding receptive field padding). The final batch may be shorter or slightly longer than this.

Returns Tensor<R3>

The 3D batched input, shaped [B, batchLength + RF_PAD * 2, D]

pianorollToNoteSequence

• pianorollToNoteSequence(frameProbs: tf.Tensor2D, onsetProbs: tf.Tensor2D, velocityValues: tf.Tensor2D, onsetThreshold?: number, frameThreshold?: number): Promise<NoteSequence>

Converts the model predictions to a NoteSequence.

Parameters

• frameProbs: tf.Tensor2D

Probabilities of an active frame, shaped [frame, pitch].

• onsetProbs: tf.Tensor2D

Probabilities of an onset, shaped [frame, pitch].

• velocityValues: tf.Tensor2D

Predicted velocities in the range [0, 127], shaped [frame, pitch].

Returns Promise<NoteSequence>

A NoteSequence containing the transcribed piano performance.
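
To give a feel for how the three [frame, pitch] inputs and the two thresholds might interact, here is a simplified sketch on plain arrays. It is not the library's implementation and it does not produce a NoteSequence: `decodePianoroll` and `SketchNote` are hypothetical, the default thresholds of 0.5 are assumptions, and the decoding rule (a note starts at an onset above `onsetThreshold` and lasts while the frame or onset probability stays above its threshold) is one plausible reading of the parameters.

```typescript
// Hypothetical note record, measured in frames rather than seconds.
interface SketchNote {
  pitch: number;       // Column index into the pianoroll.
  startFrame: number;  // First frame of the note (inclusive).
  endFrame: number;    // Frame at which the note ends (exclusive).
  velocity: number;    // Velocity value read at the onset frame.
}

// Simplified decoding sketch; the default thresholds are assumptions.
function decodePianoroll(
    frameProbs: number[][], onsetProbs: number[][],
    velocityValues: number[][], onsetThreshold = 0.5,
    frameThreshold = 0.5): SketchNote[] {
  const notes: SketchNote[] = [];
  const numFrames = frameProbs.length;
  const numPitches = numFrames > 0 ? frameProbs[0].length : 0;
  for (let p = 0; p < numPitches; p++) {
    let active: SketchNote | null = null;
    for (let f = 0; f < numFrames; f++) {
      const isOnset = onsetProbs[f][p] > onsetThreshold;
      const isFrame = frameProbs[f][p] > frameThreshold;
      if (active === null) {
        // A note can only begin at a frame with a confident onset.
        if (isOnset) {
          active = {
            pitch: p,
            startFrame: f,
            endFrame: f + 1,
            velocity: velocityValues[f][p],
          };
        }
      } else if (isOnset || isFrame) {
        // The note is still sounding: extend it through this frame.
        active.endFrame = f + 1;
      } else {
        // Neither probability is above threshold: the note has ended.
        notes.push(active);
        active = null;
      }
    }
    // Close any note still sounding at the end of the input.
    if (active !== null) {
      notes.push(active);
    }
  }
  return notes;
}
```

In this sketch, an active frame with no preceding onset never produces a note.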

unbatchOutput

• unbatchOutput(batches: tf.Tensor3D, batchLength: number, totalLength: number): Tensor<R3>

Unbatches the input, reversing the procedure of batchInput.

Parameters

• batches: tf.Tensor3D

The batched input matrix.

• batchLength: number

The desired batch length (excluding receptive field padding). The final batch may be shorter or slightly longer than this.

• totalLength: number

The total number of rows N in the original, unbatched input.

Returns Tensor<R3>

The unbatched output as a rank-3 tensor, shaped [1, totalLength, D]
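
Reversing the batching amounts to dropping the borrowed rows from each batch and joining what remains. The sketch below does this on plain nested arrays rather than tensors, so it returns a 2D list of rows. It is an illustration only: `unbatchRows` is a hypothetical helper, `rfPad` stands in for RF_PAD, and it assumes the batches were laid out as described for batchInput (first batch padded only at the end, final batch padded only at the beginning, middle batches padded on both sides).

```typescript
// Hypothetical helper: rejoins padded batches into the original rows.
function unbatchRows(
    batches: number[][][], batchLength: number, rfPad: number,
    totalLength: number): number[][] {
  if (batches.length === 1) {
    // Assumption: a single batch carries no padding at all.
    return batches[0];
  }
  const last = batches.length - 1;
  // Rows that belong to the final batch once all earlier batches have
  // contributed batchLength rows each.
  const finalLength = totalLength - last * batchLength;
  const rows: number[][] = [];
  batches.forEach((batch, i) => {
    if (i === 0) {
      // First batch: keep the leading rows, drop the trailing padding.
      rows.push(...batch.slice(0, batchLength));
    } else if (i === last) {
      // Final batch: keep only the trailing rows that are its own.
      rows.push(...batch.slice(batch.length - finalLength));
    } else {
      // Middle batch: drop rfPad rows of padding on each side.
      rows.push(...batch.slice(rfPad, rfPad + batchLength));
    }
  });
  return rows;
}
```

Because the final batch's own length is derived from totalLength, the same code covers the case where a short remainder was merged into the previous batch.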
