recommenders-team/recommenders

RBM - how to get the affinity matrix from item_back_dict and user_back_dict

Open

#868 opened on Jul 18, 2019

help wanted


Description

I am trying to implement AzureML Hyperdrive-based hyperparameter tuning of the RBM algorithm using the example notebooks. I have a working RBM notebook with my dataset, and I am using svd_training.py as a template for building my rbm_training.py file. As part of the RBM process, an affinity matrix is created and the training and test sets are built by the stratified sampler. Looking at the code, there is an optional parameter save_path that stores four numpy output files (item_back_dict.npy, item_dict.npy, user_back_dict.npy and user_dict.npy) when AffinityMatrix is invoked as follows:

am1m = AffinityMatrix(DF = data, **header, save_path = DATA_DIR)

I am uploading the train and validate pkl data files to the default datastore from my local machine.

During evaluation, the following code requires the affinity matrix:

top_k_df_1m = am1m.map_back_sparse(top_k_1m, kind = 'prediction')
test_df_1m = am1m.map_back_sparse(Xtst_1m, kind = 'ratings')

How do I regenerate the affinity matrix object in the script that will be run remotely (rbm_training.py)? I was hoping to use the four numpy files to enable map_back_sparse. I hope I don't have to upload the entire dataset and then regenerate an AffinityMatrix object remotely.

The AffinityMatrix code in sparse.py mentions that the numpy files can be used with a trained model, but I am not sure how to load these four files to regenerate an AffinityMatrix object when the remote script executes.
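To make the question concrete, here is a minimal sketch of what I have in mind. It is not the library's API. It assumes the .npy files hold plain Python dictionaries written with np.save, and that the "back" dictionaries map matrix row/column indices to the original user/item IDs. The helpers load_back_dicts and map_back, the column names, and the demo IDs are all hypothetical, and no AffinityMatrix object is rebuilt.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def load_back_dicts(save_path):
    """Hypothetical helper: load the index -> original ID dictionaries."""
    # np.save stores a dict as a pickled 0-d object array; .item() unwraps it.
    user_back = np.load(
        os.path.join(save_path, "user_back_dict.npy"), allow_pickle=True
    ).item()
    item_back = np.load(
        os.path.join(save_path, "item_back_dict.npy"), allow_pickle=True
    ).item()
    return user_back, item_back


def map_back(X, user_back, item_back,
             col_user="userID", col_item="itemID", col_value="rating"):
    """Hypothetical helper: dense user x item matrix -> long DataFrame.

    Only non-zero entries are kept; column names are assumptions.
    """
    rows, cols = np.nonzero(X)
    return pd.DataFrame({
        col_user: [user_back[int(r)] for r in rows],
        col_item: [item_back[int(c)] for c in cols],
        col_value: X[rows, cols],
    })


# Small demo with made-up IDs, standing in for the files written via save_path.
demo_dir = tempfile.mkdtemp()
np.save(os.path.join(demo_dir, "user_back_dict.npy"), {0: "u10", 1: "u20"})
np.save(os.path.join(demo_dir, "item_back_dict.npy"), {0: "i7", 1: "i8", 2: "i9"})

user_back, item_back = load_back_dicts(demo_dir)
X_demo = np.array([[5.0, 0.0, 3.0],
                   [0.0, 4.0, 0.0]])
demo_df = map_back(X_demo, user_back, item_back)
print(demo_df)
```

If something like this is valid, the remote script would only need user_back_dict.npy and item_back_dict.npy uploaded alongside the pkl files, rather than the full dataset. Note that allow_pickle=True should only be used on files from a trusted source.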

Other Comments
