Skip to content

A set of general optimisations #916

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Feb 12, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
ff6345b
optimise _calc_abc
shawnlaffan Feb 7, 2024
5e78dbd
_calc_abc: more micro-optimisations
shawnlaffan Feb 7, 2024
c0d2260
_calc_abc: micro-optimise the label_hash loop
shawnlaffan Feb 7, 2024
5987391
_calc_abc: micro-optimise label_list loop
shawnlaffan Feb 7, 2024
af7f23e
_calc_abc: avoid some looping when assigning to empty hashes
shawnlaffan Feb 8, 2024
736601d
hierarchical labels: remove calc_abc pre_calc
shawnlaffan Feb 8, 2024
a450987
Indices: add _calc_abc_one_element
shawnlaffan Feb 9, 2024
51b65c2
micro-optimise Spatial::compare
shawnlaffan Feb 9, 2024
7d6c576
Spatial::compare: reorder some conditions
shawnlaffan Feb 9, 2024
f361432
micro-optimise BaseData::get_range
shawnlaffan Feb 9, 2024
672d582
BaseData::transfer_element_properties return early if no properties
shawnlaffan Feb 11, 2024
5fc5e65
BaseStruct.pm: better feedback in some error messages
shawnlaffan Feb 11, 2024
eb1e22c
calc_abc[123]: only use one element variants if there is one element
shawnlaffan Feb 11, 2024
44c3aa3
Randomisation tests: ensure spatial index is used for a spatial analysis
shawnlaffan Feb 11, 2024
8db0fd7
Indices: faster range hash generation
shawnlaffan Feb 11, 2024
e99b1e0
add array args variants of get_list_ref
shawnlaffan Feb 11, 2024
9959786
calc_abc[123]: improve one element variant detection
shawnlaffan Feb 11, 2024
a7683f4
BaseStruct.pm: Add array args variant of exists_list
shawnlaffan Feb 11, 2024
2388630
rand_curveball: no need to sort the shorter swap list
shawnlaffan Feb 12, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 8 additions & 6 deletions lib/Biodiverse/BaseData.pm
Original file line number Diff line number Diff line change
Expand Up @@ -1591,21 +1591,23 @@ sub transfer_element_properties {
my $to_bd = $args{receiver} || croak "Missing receiver argument\n";
my $remap = $args{remap} || {}; # remap hash

my $progress_bar = Biodiverse::Progress->new();

my $type = $args{type};
croak "argument 'type => $type' is not valid (must be groups or labels)\n"
if not( $type eq 'groups' or $type eq 'labels' );
my $get_ref_sub = $type eq 'groups' ? 'get_groups_ref' : 'get_labels_ref';

my $elements_ref = $self->$get_ref_sub;

return if !$elements_ref->has_element_properties;

my $to_elements_ref = $to_bd->$get_ref_sub;

my $name = $self->get_param('NAME');
my $to_name = $to_bd->get_param('NAME');
my $text = "Transferring $type properties from $name to $to_name";

my $total_to_do = $elements_ref->get_element_count;
my $progress_bar = Biodiverse::Progress->new();
my $total_to_do = $elements_ref->get_element_count;
print "[BASEDATA] Transferring properties for $total_to_do $type\n";

my $count = 0;
Expand Down Expand Up @@ -2052,9 +2054,9 @@ sub get_range {

my $variety = $labels_ref->get_variety(@_);

my $range = max( ( $props->{RANGE} // -1 ), $variety );

return $range;
return defined $props
? max( ( $props->{RANGE} // -1 ), $variety )
: $variety;
}

# for backwards compatibility
Expand Down
61 changes: 48 additions & 13 deletions lib/Biodiverse/BaseStruct.pm
Original file line number Diff line number Diff line change
Expand Up @@ -387,7 +387,6 @@ sub get_element_name_as_array_aa {
$element_list_ref->[0] //= ($quote_char . $quote_char)
}
else {
my $quotes = $quote_char;
for my $el (@$element_list_ref) {
$el //= $EMPTY_STRING;
}
Expand Down Expand Up @@ -529,7 +528,7 @@ sub get_sub_element_list {

no autovivification;

my $element = $args{element} // croak "argument 'element' not specified\n";
my $element = $args{element} // croak "argument 'element' not specified in get_sub_element_list\n";

my $el_hash = $self->{ELEMENTS}{$element}{SUBELEMENTS}
// return;
Expand All @@ -544,7 +543,7 @@ sub get_sub_element_hash {
no autovivification;

my $element = $args{element}
// croak "argument 'element' not specified\n";
// croak "argument 'element' not specified in get_sub_element_hash\n";

# Ideally we should throw an exception, but at the moment too many other
# things need a result and we aren't testing for them.
Expand All @@ -567,7 +566,7 @@ sub get_sub_element_hash_aa {

no autovivification;

croak "argument 'element' not specified\n"
croak "argument 'element' not specified in get_sub_element_hash_aa\n"
if !defined $element;

# Ideally we should throw an exception, but at the moment too many other
Expand All @@ -582,7 +581,7 @@ sub get_sub_element_hash_aa {
sub get_sub_element_href_autoviv_aa {
my ($self, $element) = @_;

croak "argument 'element' not specified\n"
croak "argument 'element' not specified in get_sub_element_href_autoviv_aa\n"
if !defined $element;

return $self->{ELEMENTS}{$element}{SUBELEMENTS} //= {};
Expand Down Expand Up @@ -907,9 +906,9 @@ sub exists_sub_element {
#defined $args{element} || croak "Argument 'element' not specified\n";
#defined $args{subelement} || croak "Argument 'subelement' not specified\n";
my $element = $args{element}
// croak "Argument 'element' not specified\n";
// croak "Argument 'element' not specified in exists_sub_element\n";
my $subelement = $args{subelement}
// croak "Argument 'subelement' not specified\n";
// croak "Argument 'subelement' not specified in exists_sub_element\n";

no autovivification;
exists $self->{ELEMENTS}{$element}{SUBELEMENTS}{$subelement};
Expand Down Expand Up @@ -1074,6 +1073,17 @@ sub exists_list {
return;
}

sub exists_list_aa {
my ($self, $element, $list) = @_;

croak "element not specified\n" if not defined $element;
croak "list not specified\n" if not defined $list;

no autovivification;

return exists $self->{ELEMENTS}{$element}{$list};
}

sub add_lists {
my $self = shift;
my %args = @_;
Expand Down Expand Up @@ -1562,6 +1572,20 @@ sub get_list_ref {
return $el->{$list};
}

sub get_list_ref_aa {
my ($self, $element, $list) = @_;
no autovivification;
defined $list ? $self->{ELEMENTS}{$element}{$list} : undef;
}

sub get_list_ref_autoviv_aa {
my ($self, $element, $list) = @_;
no autovivification;
return if !exists $self->{ELEMENTS}{$element};
$self->{ELEMENTS}{$element}{$list} //= {};
}


sub rename_list {
my $self = shift;
my %args = @_;
Expand Down Expand Up @@ -1781,13 +1805,14 @@ sub get_base_stats { # calculate basestats for a single element
sub get_element_property_keys {
my $self = shift;

my $keys = $self->get_cached_value ('ELEMENT_PROPERTY_KEYS');
state $cache_name = 'ELEMENT_PROPERTY_KEYS';
my $keys = $self->get_cached_value ($cache_name);

return wantarray ? @$keys : $keys if $keys;

my @keys = $self->get_hash_list_keys_across_elements (list => 'PROPERTIES');

$self->set_cached_value ('ELEMENT_PROPERTY_KEYS' => \@keys);
$self->set_cached_value ($cache_name => \@keys);

return wantarray ? @keys : \@keys;
}
Expand Down Expand Up @@ -1869,10 +1894,20 @@ sub get_element_properties_summary_stats {

sub has_element_properties {
my $self = shift;

my @keys = $self->get_element_property_keys;

return scalar @keys;

my $keys = $self->get_element_property_keys // [];

return scalar @$keys;
}

# maybe should cache
sub has_element_range_property {
my $self = shift;

my $prop_keys = $self->get_element_property_keys // [];
my $has_range_property = grep {$_ eq 'RANGE'} @$prop_keys;

return scalar $has_range_property;
}

# return true if the labels are all numeric
Expand Down
15 changes: 12 additions & 3 deletions lib/Biodiverse/Indices/Endemism.pm
Original file line number Diff line number Diff line change
Expand Up @@ -127,11 +127,20 @@ sub get_label_range_hash {
my $self = shift;

my $bd = $self->get_basedata_ref;
my $lb = $bd->get_labels_ref;
my $has_range_property = $lb->has_element_range_property;

my %range_hash;

foreach my $label ($bd->get_labels) {
$range_hash{$label} = $bd->get_range (element => $label);
if ($has_range_property) {
foreach my $label ($bd->get_labels) {
$range_hash{$label} = $bd->get_range(element => $label);
}
}
else {
# use more direct calculation if no label property for range
foreach my $label ($bd->get_labels) {
$range_hash{$label} = $lb->get_variety_aa($label);
}
}

my %results = (label_range_hash => \%range_hash);
Expand Down
4 changes: 2 additions & 2 deletions lib/Biodiverse/Indices/HierarchicalLabels.pm
Original file line number Diff line number Diff line change
Expand Up @@ -55,7 +55,7 @@ The number of list elements generated depends on how many axes are used in the l
Axes are order from zero as the highest level in the hierarchy,
so index 0 is the top level of the hierarchy.

Note that this calculation prodices lists since version 4.99_002
Note that this calculation produces lists since version 4.99_002
so one can no longer use the SUMRAT indices for clustering.
This can be re-enabled if there is a need.
END_H_DESC
Expand All @@ -70,8 +70,8 @@ END_H_DESC
type => 'Hierarchical Labels',
reference => $ref,
indices => \%indices,
# these are not used any more - should get the number of label axes directly
pre_calc_global => 'get_basedatas_by_label_hierarchy',
pre_calc => 'calc_abc', # we need the element lists
uses_nbr_lists => 2, # how many sets of lists it must have
);

Expand Down
Loading