DLWrap
Deep transfer learning model for disease risk prediction using high-dimensional genomic data.
Building an accurate disease risk prediction model is an essential step towards precision medicine. While high-dimensional genomic data provide valuable resources for investigating disease risk, their high noise levels and the complex relationships between predictors and outcomes pose substantial analytical challenges. In this work, we developed a deep neural network (DNN) based prediction modeling framework. We first proposed a group-wise feature importance score for feature selection, which efficiently detects genes harboring genetic variants with linear and non-linear effects. We then developed an explainable, transfer-learning based DNN method that directly incorporates information from feature selection and accurately captures complex predictive effects. The proposed DNN framework is computationally efficient and can be applied to genome-wide data. Through extensive simulations and real data analyses, we demonstrated that the proposed method not only detects predictive features efficiently but also predicts disease risk accurately, compared with many existing methods.
Install
The requirements listed in the requirements.txt are only what’s required to install this package so you can use it as a module. They aren’t sufficient to actually run all of the scripts and notebooks. In addition, you will need:
python=3.6.13
sklearn
numpy==1.19.5
pandas==1.1.5
statistics==1.0.3.5
keras==2.4.3
scipy==1.5.4
tensorflow==2.4.0
Procedure parameters
TestSelection: Analysis to be performed.
TestSelection=1: Perform screening analysis for one gene;
TestSelection=2: Perform screening analysis for multiple genes;
TestSelection=3: Perform prediction analysis given screening results;
TestSelection=4: Perform both screening and prediction analyses;
TestSelection=5: Perform prediction with selected gene.
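The five values above differ only in whether a run screens, predicts, or does both. A minimal sketch of that mapping follows. It restates the list and is not taken from the DLWrap code; `ANALYSIS_MODES` and `describe_mode` are hypothetical names used only for illustration.

```python
# Hypothetical lookup table, not part of DLWrap: it restates the
# TestSelection values listed above as (label, screening, prediction).
ANALYSIS_MODES = {
    1: ("screening for one gene", True, False),
    2: ("screening for multiple genes", True, False),
    3: ("prediction given screening results", False, True),
    4: ("screening and prediction", True, True),
    5: ("prediction with selected gene", False, True),
}


def describe_mode(test_selection):
    """Return (label, runs_screening, runs_prediction) for a TestSelection value."""
    if test_selection not in ANALYSIS_MODES:
        raise ValueError(f"TestSelection must be 1-5, got {test_selection!r}")
    return ANALYSIS_MODES[test_selection]


label, runs_screening, runs_prediction = describe_mode(4)
print(label, runs_screening, runs_prediction)
```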
Data-related parameters
train_y_input: Phenotype of the training set, with columns FID, IID, and Y. The file should have a header. default=None
test_y_input: Phenotype of the testing set, in the same format as the training set. default=None
binary_outcome: Whether the outcome is binary: =1 binary; =0 continuous. The default is continuous (i.e., =0).
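As an illustration of the phenotype format, the sketch below parses a table with the header FID, IID, Y and checks the values against binary_outcome. The comma delimiter and the 0/1 coding for binary outcomes are assumptions, since neither is stated here, and `read_phenotype` is a hypothetical helper, not a DLWrap function.

```python
import csv
import io


def read_phenotype(handle, binary_outcome=0, delimiter=","):
    """Parse a phenotype table with header FID, IID, Y.

    The delimiter is an assumption; adjust it to match your files.
    Returns a list of (FID, IID, y) tuples with y converted to float.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    missing = {"FID", "IID", "Y"} - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing header columns: {sorted(missing)}")
    rows = [(r["FID"], r["IID"], float(r["Y"])) for r in reader]
    if binary_outcome == 1:
        # Assumed coding: a binary outcome takes only the values 0 and 1.
        bad = [y for _, _, y in rows if y not in (0.0, 1.0)]
        if bad:
            raise ValueError(f"binary_outcome=1 but found values {bad[:5]}")
    return rows


# Tiny in-memory example with made-up sample IDs.
example = io.StringIO("FID,IID,Y\nf1,i1,0\nf2,i2,1\n")
print(read_phenotype(example, binary_outcome=1))
```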
1. Only for one gene
train_x_input: Genotype of the training set, in comma-delimited plink .raw format with a header. The first 6 columns are FID, IID, PAT, MAT, SEX, PHENOTYPE, and the remaining columns are SNPs. Valid when only one gene is to be screened. default=None
gene_name: The name of the gene. If not provided, it will be 0. Only valid when screening_index=1. default=None
test_groups: Groups of tests that need to be performed. If not provided, all genes will be grouped together. Valid when only one gene is to be screened. default=None
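The genotype layout described above (six leading columns, then one column per SNP) can be checked and split as in the sketch below. It uses only the standard library; `read_raw_genotypes` is a hypothetical helper, and treating "NA" as the missing-genotype code is an assumption about your files.

```python
import csv
import io

PLINK_RAW_LEADING = ["FID", "IID", "PAT", "MAT", "SEX", "PHENOTYPE"]


def read_raw_genotypes(handle):
    """Split a comma-delimited .raw table into sample IDs, SNP names, and genotypes."""
    reader = csv.reader(handle)
    header = next(reader)
    if header[:6] != PLINK_RAW_LEADING:
        raise ValueError(f"unexpected leading columns: {header[:6]}")
    snp_names = header[6:]
    sample_ids, genotypes = [], []
    for row in reader:
        if not row:  # skip blank lines
            continue
        sample_ids.append((row[0], row[1]))
        # Assumption: missing genotypes are written as "NA"; keep them as None.
        genotypes.append([None if v == "NA" else float(v) for v in row[6:]])
    return sample_ids, snp_names, genotypes


# Made-up two-sample, two-SNP example.
raw = io.StringIO(
    "FID,IID,PAT,MAT,SEX,PHENOTYPE,rs1_A,rs2_G\n"
    "f1,i1,0,0,1,-9,0,2\n"
    "f2,i2,0,0,2,-9,1,NA\n"
)
ids, snps, geno = read_raw_genotypes(raw)
print(ids, snps, geno)
```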
2. Multiple genes
GeneticDataTrain: The list of training genotype files. Each file contains genotypes of the training set in comma-delimited plink .raw format with a header. The first 6 columns are FID, IID, PAT, MAT, SEX, PHENOTYPE, and the remaining columns are SNPs. default=None
GeneticDataTest: The list of testing genotype files. Each file contains genotypes of the testing set in comma-delimited plink .raw format with a header. The first 6 columns are FID, IID, PAT, MAT, SEX, PHENOTYPE, and the remaining columns are SNPs. The files should be listed in the same order as in GeneticDataTrain (i.e., in the same gene order). default=None
geneindexFile: The list of gene names. It should be in the same order as GeneticDataTrain. default=None
test_groupsFile: The list of test-group files. It should be in the same order as GeneticDataTrain. Each file contains the grouping for one gene. NOT SUPPORTED FOR THE MOMENT! default=None
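Because GeneticDataTrain, GeneticDataTest, and geneindexFile are matched purely by position, a length check before running can catch a misaligned list early. The sketch below assumes each list file holds one entry per line, which is not stated here; `read_list`, `pair_gene_files`, and the file names are hypothetical.

```python
import io


def read_list(handle):
    """Read one entry per line, ignoring blank lines (assumed list-file layout)."""
    return [line.strip() for line in handle if line.strip()]


def pair_gene_files(gene_names, train_files, test_files):
    """Pair gene names with train/test genotype files by shared position."""
    if not (len(gene_names) == len(train_files) == len(test_files)):
        raise ValueError(
            "list lengths differ: "
            f"{len(gene_names)} genes, {len(train_files)} train, {len(test_files)} test"
        )
    return [
        {"gene": g, "train": tr, "test": te}
        for g, tr, te in zip(gene_names, train_files, test_files)
    ]


# Made-up names for illustration only.
genes = read_list(io.StringIO("GENE1\nGENE2\n"))
train = read_list(io.StringIO("GENE1_train.raw\nGENE2_train.raw\n"))
test = read_list(io.StringIO("GENE1_test.raw\nGENE2_test.raw\n"))
for entry in pair_gene_files(genes, train, test):
    print(entry)
```

A length check cannot detect two files swapped within a list, so the ordering itself still has to be verified by hand.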
Output-related parameters
AssociationDir: The folder where screening results are saved. default=None
outputPredFile: The file to which predicted values are written. default=None
MLP module parameters
seed_value: The random seed to be set. default=0
nunit1: The number of hidden units in the first hidden layer. The default is 50. Valid only when screening is performed.
nunit2: The number of hidden units in the second hidden layer. The default is 10. Valid only when screening is performed.
……
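To give a sense of scale for nunit1 and nunit2, the sketch below counts the trainable weights in a network with two hidden layers of those sizes. It assumes fully connected layers with bias terms and a single output unit; the actual DLWrap architecture may differ, and `mlp_parameter_count` is a hypothetical helper.

```python
def mlp_parameter_count(n_inputs, nunit1=50, nunit2=10, n_outputs=1):
    """Count weights and biases in a dense network: inputs -> nunit1 -> nunit2 -> outputs.

    Assumes fully connected layers with one bias per unit.
    """
    layer_sizes = [n_inputs, nunit1, nunit2, n_outputs]
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total


# A gene with 100 SNPs and the default layer sizes:
# 100*50 + 50 = 5050, 50*10 + 10 = 510, 10*1 + 1 = 11.
print(mlp_parameter_count(100))  # 5571
```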
