Acapella Extractor

Just saw this AI-trained Acapella Extractor. It is also capable of extracting other instrument stems separately or together (karaoke). I was wondering if this could be useful for feeding Beat Sage separate components to get better maps.

My guess is that anything with strings, wubs, or other synthesized sounds wouldn't separate reliably enough, and it would add significant server processing. In general, it's more reliable to remove vocals, which have typical frequency ranges and sit centered in the stereo field, and drums, which have very defined peaks and also share stereo data. Since vocals are easy enough to rip without AI for most tracks, and drum rhythms can be extracted easily enough without the other stuff, it would probably be best left as a step for the user prior to submission.

I'm completely neglecting the fact that Beat Sage would also need to be trained on tracks that have been separated, so there would be significant human effort needed to build the training set. And when I say significant, I mean the audio resolution needed to produce a clean track would be orders of magnitude greater than simply trying to extract generalized events; here's an article explaining what size "pixels" are needed to produce this (hint: 11ms requires 500,000 samples at a sampling rate of 22,100Hz, or about 82,000 720p images per 3:30 song):

Sure, the model above could be trained better, but that doesn't change the data resolution needed. It's a neat idea, but the technology isn't quite there yet, so the only way to overcome these things is with more computing and a huge community effort, or the devs making use of the code and training set for future Beat Sage releases. I'm not an expert or anything, but I've been following this kind of thing for a while.
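The "vocals are centered in stereo" point above is the basis of the classic non-AI vocal rip: subtract one channel from the other, and anything panned dead-center cancels out. A minimal NumPy sketch on a synthetic signal (the "vocal"/"guitar" split and the 440/196 Hz tones are just illustrative assumptions, not part of any real tool):

```python
import numpy as np

def remove_center(stereo: np.ndarray) -> np.ndarray:
    """Cancel anything panned dead-center (often the lead vocal)
    by subtracting the right channel from the left of an (N, 2) array."""
    return stereo[:, 0] - stereo[:, 1]

# Synthetic demo: a "vocal" mixed equally into both channels,
# plus a "guitar" panned hard left.
t = np.linspace(0, 1, 22050, endpoint=False)
vocal = np.sin(2 * np.pi * 440 * t)          # center-panned tone
guitar = 0.5 * np.sin(2 * np.pi * 196 * t)   # left-channel-only tone
mix = np.stack([vocal + guitar, vocal], axis=1)

out = remove_center(mix)  # the centered "vocal" cancels; the "guitar" survives
```

On real mixes this is much messier (reverb and bass are often centered too), which is part of why it only works "for most tracks" rather than all of them.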
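The "very defined peaks" that drums leave can likewise be picked out without AI. A rough sketch using short-time energy and simple peak picking; the window size, threshold rule, and synthetic "hits" are all assumptions for illustration, not how any particular extractor works:

```python
import numpy as np

def onset_peaks(signal: np.ndarray, rate: int, win: float = 0.02) -> np.ndarray:
    """Return approximate sample indices of energy peaks, a crude
    stand-in for locating drum hits in a waveform."""
    hop = int(rate * win)
    n = len(signal) // hop
    # Short-time energy per non-overlapping window.
    energy = np.array([np.sum(signal[i * hop:(i + 1) * hop] ** 2)
                       for i in range(n)])
    # A window counts as a hit if it beats both neighbours and a
    # global mean-plus-two-sigma threshold (an arbitrary choice here).
    thresh = energy.mean() + 2 * energy.std()
    peaks = [i for i in range(1, n - 1)
             if energy[i] > thresh
             and energy[i] >= energy[i - 1]
             and energy[i] > energy[i + 1]]
    return np.array(peaks) * hop

# Synthetic demo: one second of silence with two sharp "drum hits".
rate = 22050
sig = np.zeros(rate)
sig[5000:5100] = 1.0
sig[15000:15100] = 1.0
hits = onset_peaks(sig, rate)  # two peaks, near samples 5000 and 15000
```

Real drum tracks would need band-filtering and a smarter threshold, but the point stands: transient peaks are far easier to localize than a full clean stem is to reconstruct.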