I'm new to Google Cloud Platform and I'm trying to train a model on a TPU. I followed this tutorial to set up the TPU with Google Colab. All of the code below comes from the tutorial. These are the steps I went through:

    import datetime
    import json
    import os
    import pprint
    import random
    import string
    import sys
    import tensorflow as tf

    assert 'COLAB_TPU_ADDR' in os.environ, 'ERROR: Not connected to a TPU runtime; please see the first cell in this notebook for instructions!'
    TPU_ADDRESS = 'grpc://' + os.environ['COLAB_TPU_ADDR']
    print('TPU address is => ', TPU_ADDRESS)

    from google.colab import auth
    auth.authenticate_user()
    with tf.Session(TPU_ADDRESS) as session:
      print('TPU devices:')
      pprint.pprint(session.list_devices())

      # Upload credentials to TPU.
      with open('/content/adc.json', 'r') as f:
        auth_info = json.load(f)
      tf.contrib.cloud.configure_gcs(session, credentials=auth_info)
      # Now credentials are set for all future sessions on this TPU.

Output:

    TPU address is => grpc://10.4.89.154:8470

Then I provide my bucket name and output directory name:

    BUCKET = 'my_xlnet' #@param {type:"string"}
    assert BUCKET, '*** Must specify an existing GCS bucket name ***'
    output_dir_name = 'xlnet_output' #@param {type:"string"}
    BUCKET_NAME = 'gs://{}'.format(BUCKET)
    OUTPUT_DIR = 'gs://{}/{}'.format(BUCKET, output_dir_name)
    tf.gfile.MakeDirs(OUTPUT_DIR)
    print('***** Model output directory: {} *****'.format(OUTPUT_DIR))

I move the pretrained model to the GCS bucket:

    !gsutil mv /content/xlnet_extension_tf/model/xlnet_cased_L-24_H-1024_A-16 $BUCKET_NAME

Output:

    ...
    Operation completed over 5 objects/1.3 GiB.

Then I run the main script:

    !python /content/xlnet_extension_tf/run_coqa.py \
      --use_tpu=True \
      --tpu_name=grpc://10.4.89.154:8470 \
      --spiece_model_file=$BUCKET_NAME/xlnet_cased_L-24_H-1024_A-16/spiece.model \
      --model_config_path=$BUCKET_NAME/xlnet_cased_L-24_H-1024_A-16/xlnet_config.json \
      --init_checkpoint=$BUCKET_NAME/xlnet_cased_L-24_H-1024_A-16/xlnet_model.ckpt \
      ...

And I get this error:

    OSError: Not found: "gs://my_xlnet/xlnet_cased_L-24_H-1024_A-16/spiece.model": No such file or directory Error #2

This is the GCS bucket screen:

I don't understand why this error occurs, since the pretrained model was moved into the bucket successfully. Does anyone know how to fix this?
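One possible cause worth checking (an assumption, since the relevant part of run_coqa.py is not shown): the `Not found: ... No such file or directory Error #2` message has the form produced by the SentencePiece library. As far as I can tell, SentencePiece opens model files through the local filesystem, so it would not understand a `gs://` path even when the object exists in the bucket. If that is what is happening here, a workaround would be to keep a local copy of spiece.model and pass that local path to `--spiece_model_file`, while the config and checkpoint stay on GCS. A minimal sketch of that idea follows; the helper names and the `/content` default directory are hypothetical, and it assumes `gsutil` is on the PATH.

```python
import posixpath
import subprocess


def is_gcs_path(path):
    """Return True for paths that point into a GCS bucket."""
    return path.startswith("gs://")


def local_copy_command(gcs_path, local_dir="/content"):
    """Build the gsutil command that copies one GCS object to local disk.

    Returns (argv, local_path). The local_dir default is an assumption
    that matches the Colab paths used in the question.
    """
    local_path = posixpath.join(local_dir, posixpath.basename(gcs_path))
    return ["gsutil", "cp", gcs_path, local_path], local_path


def resolve_local_path(path, local_dir="/content", run=subprocess.check_call):
    """Return a path that a local-filesystem-only loader can open.

    Local paths are returned unchanged. gs:// paths are copied down with
    gsutil first; `run` is injectable so the logic can be exercised
    without actually calling gsutil.
    """
    if not is_gcs_path(path):
        return path
    argv, local_path = local_copy_command(path, local_dir)
    run(argv)
    return local_path
```

With this, `resolve_local_path("gs://my_xlnet/xlnet_cased_L-24_H-1024_A-16/spiece.model")` would copy the file to `/content/spiece.model` and return that path, which could then be passed to `--spiece_model_file`.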