decrease memory utilization #50

Merged
merged 3 commits into from
Jan 17, 2024
Conversation

Contributor

@d-hrs d-hrs commented Jan 12, 2024

description

To avoid an OOM error such as the following:

Status: Status{code=CANCELLED, description=Failed to read message., cause=java.lang.OutOfMemoryError: GC overhead limit exceeded
	at com.google.ads.googleads.v14.resources.Campaign.newBuilderForType(Campaign.java:18066)
	at com.google.ads.googleads.v14.resources.Campaign.newBuilderForType(Campaign.java:13)
	at com.google.protobuf.GeneratedMessageV3.newBuilderForType(GeneratedMessageV3.java:555)
	at com.google.protobuf.SingleFieldBuilderV3.getBuilder(SingleFieldBuilderV3.java:131)
	at com.google.ads.googleads.v14.services.GoogleAdsRow$Builder.mergeFrom(GoogleAdsRow.java:11160)
	at com.google.ads.googleads.v14.services.GoogleAdsRow$1.parsePartialFrom(GoogleAdsRow.java:37289)
	at com.google.ads.googleads.v14.services.GoogleAdsRow$1.parsePartialFrom(GoogleAdsRow.java:37281)
	at com.google.protobuf.CodedInputStream$ArrayDecoder.readMessage(CodedInputStream.java:883)
	at com.google.ads.googleads.v14.services.SearchGoogleAdsResponse$Builder.mergeFrom(SearchGoogleAdsResponse.java:707)
	at com.google.ads.googleads.v14.services.SearchGoogleAdsResponse$1.parsePartialFrom(SearchGoogleAdsResponse.java:1587)
	at com.google.ads.googleads.v14.services.SearchGoogleAdsResponse$1.parsePartialFrom(SearchGoogleAdsResponse.java:1579)
	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:86)
	at com.google.protobuf.AbstractParser.parseFrom(AbstractParser.java:48)
	at io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller.parseFrom(ProtoLiteUtils.java:223)
	at io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller.parse(ProtoLiteUtils.java:215)
	at io.grpc.protobuf.lite.ProtoLiteUtils$MessageMarshaller.parse(ProtoLiteUtils.java:118)
	at io.grpc.MethodDescriptor.parseResponse(MethodDescriptor.java:284)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInternal(ClientCallImpl.java:660)
	at io.grpc.internal.ClientCallImpl$ClientStreamListenerImpl$1MessagesAvailable.runInContext(ClientCallImpl.java:647)
	at io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
	at io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
}.

It is caused by iterating over all pages and reading every result into memory at once (implemented in #45):

response.iteratePages().iterator().forEachRemaining(pages::add);
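
As a rough illustration of the difference, the following is a minimal sketch and not the plugin's actual code. It assumes a hypothetical stand-in for the paged response in which each page is a `List<String>` of rows; `PageStreamingSketch`, `collectAll`, and `streamPages` are names invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in: each page of the response is modeled as a List<String> of rows.
final class PageStreamingSketch {

    // Accumulating approach: every page stays reachable until the whole result has been read,
    // so heap usage grows with the total number of rows.
    static int collectAll(Iterable<List<String>> response) {
        List<List<String>> pages = new ArrayList<>();
        response.iterator().forEachRemaining(pages::add);
        int rows = 0;
        for (List<String> page : pages) {
            rows += page.size();
        }
        return rows;
    }

    // Streaming approach: each page is handed to the consumer and no reference is kept here,
    // so a lazily fetched page can be collected once the consumer is done with it.
    static void streamPages(Iterable<List<String>> response, Consumer<List<String>> consumer) {
        for (List<String> page : response) {
            consumer.accept(page);
        }
    }

    public static void main(String[] args) {
        // In-memory sample data for demonstration only; a real response would fetch pages lazily.
        List<List<String>> response = Arrays.asList(
                Arrays.asList("row1", "row2"),
                Arrays.asList("row3"));
        int[] streamed = {0};
        streamPages(response, page -> streamed[0] += page.size());
        System.out.println(collectAll(response) + " rows collected, " + streamed[0] + " rows streamed");
    }
}
```

Both methods visit the same rows; only the number of pages that remain reachable at any one time differs.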

running log

embulk config

With this config.yml, approximately 80,000 records are retrieved in my environment.

in:
  type: google_ads
  customer_id: ***
  login_customer_id: ***
  resource_type: expanded_landing_page_view
  client_id: ***
  client_secret: ***
  developer_token: ***
  refresh_token: ***
  daterange:
    start_date: '2020-01-01'
    end_date: '2024-01-15'
  fields:
  - name: customer.id
    type: string
  - name: campaign.id
    type: string
  - name: segments.ad_network_type
    type: string
  - name: segments.device
    type: string
  - name: ad_group.id
    type: string
  - name: metrics.active_view_impressions
    type: long
  - name: metrics.active_view_measurability
    type: double
  - name: metrics.active_view_measurable_cost_micros
    type: long
  - name: metrics.active_view_measurable_impressions
    type: long
  - name: metrics.active_view_viewability
    type: double
  - name: expanded_landing_page_view.expanded_final_url
    type: string
  - name: metrics.clicks
    type: long
  - name: metrics.conversions
    type: double
  - name: metrics.conversions_value
    type: double
  - name: metrics.cost_micros
    type: double
  - name: segments.date
    type: timestamp
    format: "%Y-%m-%d"
  - name: metrics.impressions
    type: long
  - name: metrics.interaction_event_types
    type: json
  - name: metrics.interactions
    type: long
  - name: metrics.speed_score
    type: double
  - name: metrics.all_conversions
    type: double
  conditions: []
  _replace_dot_in_column: true
  _use_micro: false
out:
  type: 'null'

The following are the jstat logs of runs with the -J-Xmx400m option.

jstat(0.1.26)

When S0, E, and O reached 100%, the OOM occurred.

$ jstat -gcutil -h10 $(jps | grep embulk | cut -d' ' -f1) 5000
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00 100.00  46.60  31.13  97.34  94.66      5    0.073     2    0.018    0.091
100.00   0.00  62.48  42.37  96.79  95.15      6    0.089     3    0.022    0.111
  0.00 100.00  82.01  41.26  97.33  95.56      7    0.105     4    0.055    0.161
100.00   0.00  55.57  55.24  97.49  95.25      8    0.131     4    0.055    0.186
  0.00 100.00  34.47  72.77  97.30  95.25      9    0.167     5    0.058    0.225
100.00   0.00  21.26  64.96  97.36  95.25     10    0.191     8    0.135    0.326
100.00   0.00  79.82  64.96  97.36  95.25     10    0.191     9    0.138    0.329
  0.00 100.00  40.59  79.08  97.40  95.25     11    0.224    13    0.209    0.434
100.00   0.00  18.28  93.30  97.16  95.25     12    0.255    16    0.280    0.535
100.00   0.00  86.39  93.30  97.16  95.25     12    0.255    20    0.351    0.606
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00   0.00  63.04  99.99  97.21  95.25     13    0.255    27    0.833    1.088
  0.00   0.00  70.83 100.00  97.11  94.99     14    0.255    34    1.466    1.721
  0.00   0.00  97.72 100.00  97.12  94.99     15    0.255    39    2.059    2.314
100.00   0.00 100.00 100.00  97.13  94.99     18    0.255    48    3.252    3.507
100.00   0.00 100.00 100.00  97.14  94.99     19    0.255    68    7.920    8.175

jstat(this branch)

$ jstat -gcutil -h10 $(jps | grep embulk | cut -d' ' -f1) 5000
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
  0.00 100.00  41.83  20.57  97.72  96.00      3    0.044     2    0.012    0.057
100.00   0.00  60.92  34.62  97.10  94.71      6    0.086     4    0.038    0.124
  0.00 100.00  40.84  42.13  97.30  96.19      7    0.100     4    0.038    0.139
 71.76   0.00  48.38  56.59  97.27  93.87     24    0.200     6    0.082    0.282
 55.15   0.00  37.77  63.69  97.11  93.88     40    0.263     6    0.082    0.344
 55.40   0.00  27.63  72.53  97.16  93.88     58    0.338     6    0.082    0.420
 57.11   0.00  15.79  81.38  97.20  93.88     76    0.413     6    0.082    0.495
  0.00  50.98  94.21  90.18  96.98  93.88     93    0.488     6    0.082    0.570
  0.00  53.45  76.62  52.29  97.02  93.88    111    0.557     8    0.126    0.683
  0.00  48.56  64.77  61.14  97.04  93.89    129    0.629     8    0.126    0.755
  S0     S1     E      O      M     CCS    YGC     YGCT    FGC    FGCT     GCT
 52.30   0.00  13.91  68.24  97.07  93.89    144    0.690     8    0.126    0.816
 49.59   0.00   3.96  77.14  97.09  93.89    162    0.757     8    0.126    0.883
  0.00  51.52  98.09  86.01  97.11  93.89    179    0.826     8    0.126    0.952
  0.00  51.38  88.25  94.90  97.14  93.89    197    0.899    10    0.168    1.068
 52.78   0.00  27.20  55.90  97.16  93.89    212    0.952    10    0.168    1.121
  0.00  81.23   1.91  64.74  97.18  93.89    231    1.027    10    0.168    1.195
  0.00  51.88  66.95  75.41  97.19  93.89    251    1.100    10    0.168    1.269
 49.30   0.00   0.00  86.11  97.20  93.89    272    1.181    10    0.168    1.350
  0.00  49.70  21.70  48.81  97.21  93.89    291    1.252    12    0.211    1.463
  0.00  47.48  94.44  57.73  97.23  93.89    310    1.320    12    0.211    1.531
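
The two logs above can also be compared programmatically. The following is a small sketch that parses a `jstat -gcutil` data row and checks the condition described earlier (S0, E, and O all at 100%). The column positions assume the Java 8 layout shown in the logs; `JstatGcutilSketch` and its members are names invented for this example.

```java
// Sketch: parse one data row of `jstat -gcutil` output.
// Assumed column order (as in the logs above): S0 S1 E O M CCS YGC YGCT FGC FGCT GCT
final class JstatGcutilSketch {

    static final int S0 = 0;
    static final int EDEN = 2;
    static final int OLD = 3;
    static final int FGC = 8;
    static final int FGCT = 9;

    // Split a data row on whitespace and parse every column as a double.
    static double[] parseRow(String line) {
        String[] cols = line.trim().split("\\s+");
        double[] values = new double[cols.length];
        for (int i = 0; i < cols.length; i++) {
            values[i] = Double.parseDouble(cols[i]);
        }
        return values;
    }

    // True when S0, Eden, and Old are all reported as 100% used.
    static boolean isSaturated(double[] row) {
        return row[S0] >= 100.0 && row[EDEN] >= 100.0 && row[OLD] >= 100.0;
    }

    public static void main(String[] args) {
        // Last row of each log in this pull request.
        String before = "100.00   0.00 100.00 100.00  97.14  94.99     19    0.255    68    7.920    8.175";
        String after = "  0.00  47.48  94.44  57.73  97.23  93.89    310    1.320    12    0.211    1.531";
        for (String line : new String[] {before, after}) {
            double[] row = parseRow(line);
            System.out.println("saturated=" + isSaturated(row)
                    + " fullGCs=" + (int) row[FGC]
                    + " fullGCSeconds=" + row[FGCT]);
        }
    }
}
```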

@d-hrs changed the title from "increase memory usage" to "decrease memory utilization" on Jan 15, 2024
Contributor Author

d-hrs commented Jan 15, 2024

I will measure memory utilization and upload results later 🙏

}
}
} while (startDateTime != null && !startDateTime.isEmpty());
lastRow.getChangeEvent().getChangeDateTime();
Contributor

This code seems unused.

Contributor Author

fixed: de40057


response.iteratePages().iterator().forEachRemaining(pages::add);
public void search(Consumer<Iterable<GoogleAdsServiceClient.SearchPage>> consumer, Map<String, String> params) {
Contributor

nit: The iterator is currently traversed twice in order to get the last row. If the search method's consumer takes one page at a time, it only needs to be traversed once.

public void search(Consumer<GoogleAdsServiceClient.SearchPage> consumer, Map<String, String> params) {
    Iterable<GoogleAdsServiceClient.SearchPage> pages = search(params);
    // Keep a reference to the most recent page only, so the last row can be read after the loop.
    GoogleAdsServiceClient.SearchPage lastPage = null;
    for (GoogleAdsServiceClient.SearchPage page : pages) {
        consumer.accept(page);
        lastPage = page;
    }

    if (!task.getResourceType().equals("change_event") || lastPage == null) {
        return;
    }

    ...

lastRow.getChangeEvent().getChangeDateTime();
Map<String, String> nextParams = new HashMap<>();
nextParams.put("start_datetime", lastRow.getChangeEvent().getChangeDateTime());
search(consumer, nextParams);
Contributor

There is a risk that pages may not be released.

Contributor Author

fixed: 003f153

I changed it to pass a single SearchPage to the consumer, and pages has been removed.
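
For illustration, the per-page pattern discussed in this thread can be sketched as follows. This is not the merged code: the `Row` type, `searchAll`, and `fakeSearcher` are hypothetical names, each page is modeled as a `List<Row>`, and the fake searcher is assumed to return only rows strictly after `start_datetime` with a per-search row limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

final class ChangeEventPagingSketch {

    // Hypothetical row type that carries only the field needed to continue the search.
    static final class Row {
        final String changeDateTime;
        Row(String changeDateTime) { this.changeDateTime = changeDateTime; }
    }

    // Pass each page to the consumer as it arrives and keep only the last row,
    // then repeat the search from that row's timestamp until nothing is returned.
    static void searchAll(Function<Map<String, String>, Iterable<List<Row>>> searcher,
                          Consumer<List<Row>> consumer) {
        Map<String, String> params = new HashMap<>();
        String startDateTime;
        do {
            Row lastRow = null;
            for (List<Row> page : searcher.apply(params)) {
                consumer.accept(page);
                if (!page.isEmpty()) {
                    lastRow = page.get(page.size() - 1);
                }
            }
            startDateTime = (lastRow == null) ? null : lastRow.changeDateTime;
            if (startDateTime != null) {
                params = new HashMap<>();
                params.put("start_datetime", startDateTime);
            }
        } while (startDateTime != null && !startDateTime.isEmpty());
    }

    // Fake backend for the sketch: returns at most `limit` rows whose timestamp is
    // strictly greater than start_datetime, as a single page.
    static Function<Map<String, String>, Iterable<List<Row>>> fakeSearcher(List<String> all, int limit) {
        return params -> {
            String start = params.get("start_datetime");
            List<Row> matched = new ArrayList<>();
            for (String t : all) {
                if ((start == null || t.compareTo(start) > 0) && matched.size() < limit) {
                    matched.add(new Row(t));
                }
            }
            return Collections.singletonList(matched);
        };
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList(
                "2024-01-01 00:00:01", "2024-01-01 00:00:02", "2024-01-01 00:00:03",
                "2024-01-01 00:00:04", "2024-01-01 00:00:05");
        int[] rows = {0};
        searchAll(fakeSearcher(all, 2), page -> rows[0] += page.size());
        System.out.println(rows[0] + " rows consumed");
    }
}
```

Using a loop rather than a recursive call means no earlier page is still referenced from a caller's stack frame while the next search runs.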

@d-hrs d-hrs requested a review from chikamura January 17, 2024 08:20
Contributor Author

d-hrs commented Jan 17, 2024

@chikamura I wrote up the memory usage in the PR overview and fixed the code.
Please review.

@d-hrs d-hrs requested a review from t3t5u January 17, 2024 08:22
}
pageBuilder.flush();
}
Map<String, String> params = new HashMap<>();
Contributor

NOTE:
Since params is empty, the initial value of start_datetime is null.
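
The note above follows from how `java.util.Map#get` behaves for a missing key. A minimal sketch (the class and method names are invented for this example):

```java
import java.util.HashMap;
import java.util.Map;

final class EmptyParamsSketch {

    // Map#get returns null when the key is absent, so an empty params map
    // yields a null start_datetime on the first search.
    static String startDateTime(Map<String, String> params) {
        return params.get("start_datetime");
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        System.out.println("first search start_datetime: " + startDateTime(params));

        params.put("start_datetime", "2024-01-15 00:00:00");
        System.out.println("next search start_datetime: " + startDateTime(params));
    }
}
```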

Contributor

@t3t5u left a comment

LGTM

@d-hrs d-hrs merged commit 036925b into master Jan 17, 2024
1 check passed
@d-hrs d-hrs deleted the fixed branch January 17, 2024 19:01