How to build a fine-tuning dataset for code completion?

I want to fine-tune a model on my company's in-house component source code so that it can do code completion. How should I build the dataset?
For instruction-based (dialogue-style) code generation, is each sample built in this form?
{
  "input": "# write a quick sort algorithm",
  "output": "your quick sort algorithm code"
}
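To make the question concrete, here is a minimal sketch of how such pairs could be serialized, assuming one JSON object per line (JSONL) and the same `input`/`output` keys as above. The helper name `to_jsonl` is hypothetical, and whether a given training framework expects these exact keys is an assumption.

```python
import json


def to_jsonl(pairs):
    """Serialize (instruction, code) pairs as one JSON object per line."""
    lines = []
    for instruction, code in pairs:
        # Key names mirror the sample above; other frameworks may expect
        # different names (this is an assumption, not a fixed standard).
        record = {"input": instruction, "output": code}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


# Placeholder data; real pairs would come from the in-house source code.
pairs = [
    ("# write a quick sort algorithm", "def quick_sort(items):\n    ..."),
]
print(to_jsonl(pairs), end="")
```

Using `json.dumps` instead of hand-written strings keeps newlines and quotes inside the code escaped correctly.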
And how should I build a dataset for code insertion, i.e. fill-in-the-middle (FIM)?
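For reference, this is my rough understanding of how an FIM sample might be built: split a source file at two random offsets into prefix, middle, and suffix, then arrange them in prefix-suffix-middle (PSM) order. The sentinel strings below are placeholders (the real ones depend on the base model's tokenizer), and `make_fim_example` is a hypothetical helper, so treat this as a sketch rather than a confirmed recipe.

```python
import json
import random

# Placeholder sentinels: the actual special tokens depend on the model
# being fine-tuned and must match its tokenizer (assumption).
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"


def make_fim_example(code: str, rng: random.Random) -> dict:
    """Split code at two random character offsets and lay it out in PSM order."""
    # Two cut points, sorted so that start <= end.
    start, end = sorted(rng.randint(0, len(code)) for _ in range(2))
    prefix, middle, suffix = code[:start], code[start:end], code[end:]
    # The model sees prefix and suffix, then learns to generate the middle.
    return {
        "input": f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}",
        "output": middle,
    }


rng = random.Random(0)  # fixed seed so the split is reproducible
source = "def add(a, b):\n    return a + b\n"
print(json.dumps(make_fim_example(source, rng)))
```

Splitting at character offsets is the simplest choice; splitting at line or syntax-node boundaries may give more realistic insertion points, but I am not sure which is preferred.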