diff --git a/README.md b/README.md
index f5d0608..2442525 100644
--- a/README.md
+++ b/README.md
@@ -171,7 +171,7 @@ This command creates two files my-dsvm-env.compute and my-dsvm-env.runconfig und
 
 To set up Spark environment, run the following command in the CLI:
 
- az ml computetarget attach --name my-spark-env --address -ssh.azurehdinsight.net --username --password --type cluster
+ az ml computetarget attach --name my-spark-env --address -ssh.azurehdinsight.net --username --password --cluster
 
 with the name of the cluster, cluster's SSH user name and password. The default value of SSH user name is `sshuser`, unless you changed it during provisioning of the cluster. The name of the cluster can be found in Properties section of your cluster page in Azure portal:
 
diff --git a/code/01_data_acquisition_and_understanding/ReadMe.md b/code/01_data_acquisition_and_understanding/ReadMe.md
index 5dc6226..ad18755 100644
--- a/code/01_data_acquisition_and_understanding/ReadMe.md
+++ b/code/01_data_acquisition_and_understanding/ReadMe.md
@@ -46,11 +46,11 @@ To run this script into the HDInsight Spark cluster,
 ```
 az ml experiment submit -c my-spark-env 1_Download_and_Parse_XML_Spark.py
 ```
- where my-spark-env is the Spark environment defined in the [configuration step](../../ReadMe.md).
+where my-spark-env is the Spark environment defined in the [configuration step](../../README.md).
 
 ### Notes
 - There are more that 800 XML files that are present on the Medline ftp server. The shared code downloads them all which takes a long time. If you just want to test the code, you can change that and download only a subsample.
 - The source code of the PubMed Parser is also included in the repository.
 
 ### Next Step
-2. [Modeling](./code/02_modeling/ReadMe.md)
+2. [Modeling](../02_modeling/ReadMe.md)
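
Review note: the two link changes in `code/01_data_acquisition_and_understanding/ReadMe.md` can be sanity-checked with a small sketch. It assumes the Markdown renderer resolves relative links against the directory of the file that contains them, and that file names are case-sensitive (so `ReadMe.md` and `README.md` are different targets). The helper `resolve_link` is hypothetical and not part of the repository.

```python
import posixpath


def resolve_link(source_file: str, link: str) -> str:
    """Resolve a relative Markdown link against the file that contains it.

    Assumption: links are resolved relative to the containing file's directory.
    """
    base_dir = posixpath.dirname(source_file)
    return posixpath.normpath(posixpath.join(base_dir, link))


# The file touched by the second part of the diff.
SOURCE = "code/01_data_acquisition_and_understanding/ReadMe.md"

# "Next Step" link before and after the change.
old_next = resolve_link(SOURCE, "./code/02_modeling/ReadMe.md")
new_next = resolve_link(SOURCE, "../02_modeling/ReadMe.md")

# "configuration step" link before and after the change.
old_config = resolve_link(SOURCE, "../../ReadMe.md")
new_config = resolve_link(SOURCE, "../../README.md")

for label, path in [
    ("old next step", old_next),
    ("new next step", new_next),
    ("old config", old_config),
    ("new config", new_config),
]:
    print(f"{label}: {path}")
```

Under those assumptions, the old "Next Step" link pointed into a nested `code/` directory under `01_data_acquisition_and_understanding`, while the new one resolves to `code/02_modeling/ReadMe.md`. The configuration-step link now resolves to `README.md`, the root file name used in the first part of the diff.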