Hey all,
I am working with dataset uploading and I stumbled upon something.

    def publish(self):
        """Publish the dataset on the OpenML server.

        Upload the dataset description and dataset content to openml.

        Returns
        -------
        return_code : int
            Return code from server

        return_value : string
            xml return from server
        """

        file_elements = {'description': self._to_xml()}
        file_dictionary = {}

        if self.data_file is not None:
            file_dictionary['dataset'] = self.data_file

        return_value = _perform_api_call("/data/", file_dictionary=file_dictionary,
                                         file_elements=file_elements)

        self.dataset_id = int(xmltodict.parse(return_value)['oml:upload_data_set']['oml:id'])
        return self

The publish() method of OpenMLDataset uses the XML description of a dataset and an ARFF file to upload the dataset to OpenML. However, as the class is implemented right now, self.data_file is a string containing the path to the dataset file, so the method can only be used through an OpenMLDataset object that already points at a file on disk.
In my opinion we should have a function in the functions module of openml.datasets that takes the description and the ARFF file as arguments.
Something like:
publish_dataset(description, file)
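To make the idea a bit more concrete, here is a rough sketch of what I have in mind. It is not existing openml-python code: publish_dataset and its api_call parameter are hypothetical, api_call is only a stand-in for _perform_api_call (assumed to take the same arguments as in publish() above), and I parse the response with the standard library instead of xmltodict just to keep the example self-contained.

```python
import xml.etree.ElementTree as ET


def publish_dataset(description, file, api_call):
    """Hypothetical module-level upload helper (not part of openml-python).

    description : str, XML description of the dataset
    file : the ARFF file (passed through unchanged), or None
    api_call : callable standing in for _perform_api_call; assumed to accept
        (endpoint, file_dictionary=..., file_elements=...) and to return
        the server's XML response as a string
    """
    file_elements = {'description': description}
    file_dictionary = {}
    if file is not None:
        file_dictionary['dataset'] = file

    return_value = api_call("/data/", file_dictionary=file_dictionary,
                            file_elements=file_elements)

    # Look for the id element by local name, ignoring the namespace prefix.
    root = ET.fromstring(return_value)
    for element in root.iter():
        if element.tag.rsplit('}', 1)[-1] == 'id':
            return int(element.text)
    raise ValueError("no id element in server response")


def fake_api_call(endpoint, file_dictionary=None, file_elements=None):
    # Canned response for illustration only; no server is contacted.
    return ('<oml:upload_data_set xmlns:oml="http://openml.org/openml">'
            '<oml:id>42</oml:id></oml:upload_data_set>')


dataset_id = publish_dataset('<description/>', 'iris.arff', fake_api_call)
print(dataset_id)
```

With something like this, OpenMLDataset.publish() could simply delegate to the module-level function, passing self._to_xml() and self.data_file.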
What is your opinion regarding this?
(The code quoted above is from openml-python/openml/datasets/dataset.py, lines 373 to 397 at commit f618d81.)