Hi,
I found it easy to integrate snRNA and snATAC nuclei with satisfactory consistency between data modalities (which is what we expected) when fewer than 10,000 nuclei were used as input. However, with a large multiomic dataset (>100,000 nuclei), it was difficult to get a good result: the snATAC and snRNA nuclei tend to split far apart with little overlap.
My question is: am I missing something important in any of these steps (variable gene selection, normalization, scale_not_center, online_iNMF/optimize_ALS, quantile normalization, and UMAP)? Could you please suggest some starting points for tuning? I have struggled for days, learned a lot about the parameters, and tried tuning each of them.
I am almost losing my mind, please help!
Meijiao