musif.extract package¶
Submodules¶
musif.extract.common module¶
musif.extract.constants module¶
- musif.extract.constants.GLOBAL_TIME_SIGNATURE = 'global_ts'¶
The name used for the column indicating the global time signature
- musif.extract.constants.ID = 'Id'¶
The name used for the column of the music score’s id
- musif.extract.constants.MUSIC21_FILE_EXTENSIONS = ['.xml', '.mxl', '.musicxml', '.mid', '.mei']¶
Extensions used by music21. Defaults to [“.xml”, “.mxl”, “.musicxml”, “.mid”, “.mei”]
- musif.extract.constants.PLAYTHROUGH = 'playthrough'¶
Constant for the playthrough column (count of measures) added to the ms3 dataframe
- musif.extract.constants.REQUIRE_MSCORE = ['harmony', 'scale_relative']¶
Names of modules that require harmonic analysis in a .mscx file
- musif.extract.constants.VOICES_LIST = ['sop', 'ten', 'alt', 'bar', 'bbar', 'bass']¶
List of prefixes of singers' names that might appear in the scores
- musif.extract.constants.WINDOW_ID = 'WindowId'¶
The name used for the column of the window’s id
- musif.extract.constants.WINDOW_RANGE = 'WindowRange'¶
The name used for the column indicating the start and end of a window
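As an illustration of how these constants might be used, the sketch below copies two of the values listed above into a standalone script, so that it runs without musif installed. The helpers `is_music21_file` and `needs_musescore` are hypothetical names and are not part of musif; `rhythm` is a placeholder module name.

```python
from pathlib import PurePath

# Values copied from the constants documented above.
MUSIC21_FILE_EXTENSIONS = [".xml", ".mxl", ".musicxml", ".mid", ".mei"]
REQUIRE_MSCORE = ["harmony", "scale_relative"]


def is_music21_file(path):
    """Hypothetical helper: True if the extension is one that music21 reads."""
    return PurePath(path).suffix.lower() in MUSIC21_FILE_EXTENSIONS


def needs_musescore(feature_modules):
    """Hypothetical helper: True if any requested module needs a .mscx file."""
    return any(name in REQUIRE_MSCORE for name in feature_modules)


print(is_music21_file("arias/aria01.mxl"))
print(needs_musescore(["rhythm", "harmony"]))
```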
musif.extract.extract module¶
- class musif.extract.extract.FeaturesExtractor(*args, **kwargs)[source]¶
Bases:
object
Extracts features for a score or a list of scores, according to the parameters established in the configuration files. Musical features are extracted with the music21 and ms3 libraries and stored in a dictionary (score features), which the extract method returns at the end as a DataFrame.
During parsing, unpitched objects (e.g. objects belonging to percussion instruments) may be removed (see the option remove_unpitched_objects in the configuration).
- __init__(*args, **kwargs)[source]¶
- Parameters:
*args – A path to a .yml file, an AbstractExtractConfiguration object or a dictionary. Length zero or one.
**kwargs – Keywords used to construct the ExtractConfiguration.
limit_files (List[str] = None) – List of file names relative to obj. Only these files are taken. Incompatible with exclude_files
exclude_files (List[str] = None) – List of file names relative to obj. None of these files are taken. Incompatible with limit_files
- Raises:
TypeError – If the type is not one of the expected ones (str, dict or ExtractConfiguration).
ValueError – If there are too many positional arguments (args).
FileNotFoundError – If any of the file or directory paths in the configuration doesn't exist.
- extract() pandas.DataFrame [source]¶
Extracts the features specified in the configuration from a file, a directory or several file paths, and returns a DataFrame containing the musical features.
- Return type:
Score dataframe with the extracted features of the given scores. For a single score, a DataFrame with only one row is returned.
- Raises:
ParseFileError – If the musicxml file can’t be parsed for any reason.
KeyError – If features aren't loaded in the correct order or their dependencies are missing.
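The argument rules documented for `__init__` can be sketched as a standalone check. This is not the musif implementation: `check_extractor_arguments` is a hypothetical function, the configuration-object case is left out so the snippet runs without musif, and raising `ValueError` when both `limit_files` and `exclude_files` are given is an assumption (the documentation only says they are incompatible).

```python
def check_extractor_arguments(*args, limit_files=None, exclude_files=None):
    """Hypothetical stand-in mirroring the documented argument checks."""
    # Documented: zero or one positional argument, otherwise ValueError.
    if len(args) > 1:
        raise ValueError("Expected at most one positional argument")
    # Documented: str, dict or a configuration object, otherwise TypeError.
    # The configuration-object case is omitted in this sketch.
    if args and not isinstance(args[0], (str, dict)):
        raise TypeError("Expected a path to a .yml file or a dictionary")
    # Assumption: the exception type for incompatible options is not documented.
    if limit_files is not None and exclude_files is not None:
        raise ValueError("limit_files and exclude_files are incompatible")
    return args[0] if args else None


print(check_extractor_arguments("config.yml"))
```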
- musif.extract.extract.find_files(extensions: str, base_dir: str | List[str | PurePath], limit_files: List[str] | None = None, exclude_files: List[str] | None = None) List[PurePath] [source]¶
Extracts the paths to files given an extension
Given a directory path, returns a list of the paths of the files found, in alphabetical order. It searches recursively inside base_dir. If base_dir is a file, or a list of files or directories, the matching paths are returned in a list. If base_dir is neither a string nor a list of strings, a TypeError is raised; if the path doesn't exist, a ValueError is raised.
- Parameters:
extensions (str or Iterable[str]) – An extension or a list of extensions to look for
base_dir (Union[str, Iterable[str]]) – A path or directory
limit_files (Iterable[str] = None) – List of file names relative to base_dir. Only these files are taken. Incompatible with exclude_files
exclude_files (Iterable[str] = None) – List of file names relative to base_dir. None of these files are taken. Incompatible with limit_files
- Returns:
resp – The list of musicxml files found in the provided arguments. This list is returned in alphabetical order.
- Return type:
List[PurePath]
- Raises:
TypeError – If the type is not one of the expected ones (str or List[str]).
ValueError – If the provided string is neither a directory nor a file path
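The behaviour described for find_files can be approximated with the standard library alone. The function below is a minimal sketch under stated assumptions, not the musif source: `find_files_sketch` is a hypothetical name, limit_files and exclude_files are matched against POSIX-style paths relative to the directory, and they are ignored when base_dir is a single file.

```python
import tempfile
from pathlib import Path, PurePath


def find_files_sketch(extensions, base_dir, limit_files=None, exclude_files=None):
    """Hypothetical approximation of the documented search behaviour."""
    if isinstance(extensions, str):
        extensions = [extensions]
    # A list of paths: search each entry and merge the results.
    if isinstance(base_dir, (list, tuple)):
        found = []
        for item in base_dir:
            found.extend(
                find_files_sketch(extensions, item, limit_files, exclude_files)
            )
        return sorted(found)
    if not isinstance(base_dir, (str, PurePath)):
        raise TypeError("base_dir must be a path or a list of paths")
    base = Path(base_dir)
    if base.is_file():
        return [base] if base.suffix in extensions else []
    if not base.is_dir():
        raise ValueError(f"{base} is neither a directory nor a file")
    # Recursive search, then optional filtering on paths relative to base.
    found = [p for p in base.rglob("*") if p.is_file() and p.suffix in extensions]
    if limit_files is not None:
        found = [p for p in found if p.relative_to(base).as_posix() in limit_files]
    if exclude_files is not None:
        found = [p for p in found if p.relative_to(base).as_posix() not in exclude_files]
    return sorted(found)


# Small demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "aria.xml").write_text("")
    (Path(tmp) / "notes.txt").write_text("")
    print([p.name for p in find_files_sketch(".xml", tmp)])
```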
- musif.extract.extract.parse_filename(file_path: str, split_keywords: List[str], expand_repeats: bool = False, export_dfs_to: str | PurePath | None = None, remove_unpitched_objects: bool = True) music21.stream.Score [source]¶
Parses a musicxml file and returns a music21 Score object. If the file has already been parsed, it is loaded from the cache instead of being processed again. A part is split into several parts if its instrument family is listed in split_keywords, and repeats are expanded if expand_repeats is set.
- Parameters:
file_path (str) – A path to a musicxml file.
split_keywords (List[str]) – A list of keywords, based on music21 instrument sound names, used to split a part into different parts.
expand_repeats (bool) – Whether to expand the repetitions. Default value is False.
export_dfs_to (Union[str, PurePath]) – Path to a directory where dataframes containing the score data are exported. If None, no score is exported. Default value is None.
- Returns:
resp – The score saved in cache or the new score parsed with the necessary parts split.
- Return type:
Score
- Raises:
ParseFileError – If the xml file can’t be parsed for any reason.
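The "parse once, then load from cache" behaviour described above can be sketched with a plain dictionary. How musif actually keys or stores its cache is not documented here, so the names `_cache`, `parse_with_cache` and `fake_parser` are all hypothetical, and a stub stands in for the real parser.

```python
_cache = {}


def parse_with_cache(file_path, parser):
    """Return the cached result for file_path, parsing only on first request."""
    if file_path not in _cache:
        _cache[file_path] = parser(file_path)
    return _cache[file_path]


calls = []


def fake_parser(path):
    """Stub parser that records each call instead of reading a score."""
    calls.append(path)
    return {"source": path}


first = parse_with_cache("aria.xml", fake_parser)
second = parse_with_cache("aria.xml", fake_parser)
print(first is second, len(calls))  # True 1
```

The second call returns the very same object and the stub parser runs only once, which is the property the cache is meant to provide.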
- musif.extract.extract.parse_musescore_file(file_path: str, expand_repeats: bool = False) pandas.DataFrame [source]¶
This function parses a musescore file and returns a pandas dataframe. If the file has already been parsed, it will be loaded from cache instead of processing it again.
- Parameters:
file_path (str) – A path to a MuseScore .mscx file.
expand_repeats (bool) – Whether to expand the repetitions. Default value is False.
- Returns:
resp – The score saved in cache or the new score parsed in the form of a dataframe.
- Return type:
pd.DataFrame
- Raises:
ParseFileError – If the musescore file can’t be parsed for any reason.
musif.extract.utils module¶
- musif.extract.utils.expand_score_repetitions(score, repeat_elements: list)[source]¶
Given a music21 Score object and a list containing repetition elements, expands the score object and places all measures in their corresponding chronological order.
- Parameters:
score (music21 Score) – Score object parsed by music21
repeat_elements (list) – List containing all repetition elements
- Returns:
final_score – Score object with expanded repetitions
- Return type:
music21 Score
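To make "chronological order" concrete, the sketch below expands simple repeats on a plain list of measure numbers. It does not operate on music21 objects and is not the musif algorithm: `expand_repeats_sketch` is a hypothetical name, each repeat is assumed to be an inclusive `(start_index, end_index)` pair played exactly twice, and voltas, da capo and nested repeats are ignored.

```python
def expand_repeats_sketch(measures, repeats):
    """Return the measures in play order, replaying each repeated span once.

    Assumption: repeats are non-overlapping, inclusive index pairs.
    """
    expanded = []
    position = 0
    for start, end in sorted(repeats):
        # Play everything up to the end of the repeated span...
        expanded.extend(measures[position:end + 1])
        # ...then go back and play the span a second time.
        expanded.extend(measures[start:end + 1])
        position = end + 1
    # Remaining measures after the last repeat.
    expanded.extend(measures[position:])
    return expanded


print(expand_repeats_sketch([1, 2, 3, 4, 5], [(1, 2)]))  # [1, 2, 3, 2, 3, 4, 5]
```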
- musif.extract.utils.extract_global_time_signature(score_data)[source]¶
Extracts a global time signature for the score, for cases where it is not possible to get a measure-by-measure time signature
- musif.extract.utils.process_musescore_file(file_path: str, expand_repeats: bool = False) pandas.DataFrame [source]¶
Given the path to an mscx file, parses the file using the ms3 library and returns a dataframe containing all harmonic information. Adds a Playthrough column that contains the number of every measure in chronological order.
- Parameters:
file_path (str) – Path to the mscx file
expand_repeats (bool) – Whether to expand the repetitions
- Returns:
harmonic_analysis – Dataframe containing harmonic information
- Return type:
pandas.DataFrame
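The Playthrough column can be pictured as a running counter over the measures in the order they are played. The sketch below uses plain dictionaries instead of a dataframe; `add_playthrough` is a hypothetical helper, the `mn` key is a placeholder for a measure-number column, and starting the count at 1 is an assumption.

```python
PLAYTHROUGH = "playthrough"  # value copied from musif.extract.constants


def add_playthrough(rows):
    """Return copies of the rows with a running measure counter added."""
    # Assumption: the counter starts at 1 and follows the order of the rows.
    return [{**row, PLAYTHROUGH: i} for i, row in enumerate(rows, start=1)]


# Measures 1 and 2 are repeated, so they appear twice in play order.
played = [{"mn": 1}, {"mn": 2}, {"mn": 1}, {"mn": 2}, {"mn": 3}]
for row in add_playthrough(played):
    print(row)
```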