[gpfsug-discuss] GPFS Memory Usage Keeps going up and we don't know why.
IBM Spectrum Scale
scale at us.ibm.com
Thu Aug 3 06:18:46 BST 2017
Can you provide the output of "pmap 4444"? If there's no "pmap" command on
your system, then get the memory maps of mmfsd from the file
/proc/4444/maps.
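For example, something along these lines should capture it (4444 being the
mmfsd PID from your ps output; the /tmp file names are just examples):

  pmap 4444 > /tmp/mmfsd.pmap             # or "pmap -x 4444" for RSS per mapping
  # if pmap is not available:
  cat /proc/4444/maps > /tmp/mmfsd.maps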
Regards, The Spectrum Scale (GPFS) team
------------------------------------------------------------------------------------------------------------------
If you feel that your question can benefit other users of Spectrum Scale
(GPFS), then please post it to the public IBM developerWorks Forum at
https://www.ibm.com/developerworks/community/forums/html/forum?id=11111111-0000-0000-0000-000000000479
.
If your query concerns a potential software error in Spectrum Scale (GPFS)
and you have an IBM software maintenance contract, please contact
1-800-237-5511 in the United States or your local IBM Service Center in
other countries.
The forum is informally monitored as time permits and should not be used
for priority messages to the Spectrum Scale (GPFS) team.
From: Peter Childs <p.childs at qmul.ac.uk>
To: "gpfsug-discuss at spectrumscale.org"
<gpfsug-discuss at spectrumscale.org>
Date: 07/24/2017 10:22 PM
Subject: Re: [gpfsug-discuss] GPFS Memory Usage Keeps going up and
we don't know why.
Sent by: gpfsug-discuss-bounces at spectrumscale.org
top, but ps gives the same value.
[root at dn29 ~]# ps auww -q 4444
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 4444 2.7 22.3 10537600 5472580 ? S<Ll Jul12 466:13
/usr/lpp/mmfs/bin/mmfsd
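In case it helps later, the growth can be logged over time with something
like this (the one-hour interval and log path are arbitrary):

  # append mmfsd RSS/VSZ (in KB) to a log once an hour
  while true; do
      echo "$(date '+%F %T') $(ps -o rss=,vsz= -p 4444)"
      sleep 3600
  done >> /tmp/mmfsd-mem.log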
Thanks for the help
Peter.
On Mon, 2017-07-24 at 14:10 +0000, Jim Doherty wrote:
How are you identifying the high memory usage?
On Monday, July 24, 2017 9:30 AM, Peter Childs <p.childs at qmul.ac.uk>
wrote:
I've had a look at mmfsadm dump malloc and it looks to agree with the
output from mmdiag --memory, and does not seem to account for the
excessive memory usage.
The new machines do have idleSocketTimeout set to 0; from what you're
saying, it could be related to keeping that many connections between nodes
open.
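For reference, the value can be checked per node with something like:

  mmlsconfig idleSocketTimeout                    # cluster configuration value
  mmdiag --config | grep -i idleSocketTimeout     # value in effect on this node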
Thanks in advance
Peter.
[root at dn29 ~]# mmdiag --memory
=== mmdiag: memory ===
mmfsd heap size: 2039808 bytes
Statistics for MemoryPool id 1 ("Shared Segment (EPHEMERAL)")
128 bytes in use
17500049370 hard limit on memory usage
1048576 bytes committed to regions
1 number of regions
555 allocations
555 frees
0 allocation failures
Statistics for MemoryPool id 2 ("Shared Segment")
42179592 bytes in use
17500049370 hard limit on memory usage
56623104 bytes committed to regions
9 number of regions
100027 allocations
79624 frees
0 allocation failures
Statistics for MemoryPool id 3 ("Token Manager")
2099520 bytes in use
17500049370 hard limit on memory usage
16778240 bytes committed to regions
1 number of regions
4 allocations
0 frees
0 allocation failures
On Mon, 2017-07-24 at 13:11 +0000, Jim Doherty wrote:
There are 3 places that the GPFS mmfsd uses memory: the pagepool plus 2
shared memory segments. To see the memory utilization of the shared
memory segments, run the command "mmfsadm dump malloc". The statistics
for memory pool id 2 are where the maxFilesToCache/maxStatCache objects
live, and the manager nodes use memory pool id 3 to track the MFTC/MSC
objects.
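One way to see which pool (if any) is growing is to snapshot the dump
periodically and compare, for example:

  # take a snapshot now and another after a few days, then compare the pool totals
  mmfsadm dump malloc > /tmp/malloc.$(date +%Y%m%d-%H%M)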
You might want to upgrade to a later PTF, as there was a fix for a memory
leak in tscomm associated with network connection drops.
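The PTF level installed on a node can be confirmed with, for example:

  mmdiag --version
  # or
  rpm -qa | grep -i gpfs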
On Monday, July 24, 2017 5:29 AM, Peter Childs <p.childs at qmul.ac.uk>
wrote:
We have two GPFS clusters.
One is fairly old, running 4.2.1-2 without CCR, and the nodes run fine,
consistently using about 1.5G of memory (the GPFS pagepool is set to 1G,
so that looks about right).
The other one is "newer", running 4.2.1-3 with CCR, and the nodes keep
increasing their memory usage: they start at about 1.1G and are fine for a
few days, but after a while they grow to 4.2G, which means that when the
nodes need to run real work, the work can't be done.
I'm losing track of what may be different other than CCR, and I'm trying
to find some more ideas of where to look.
I've checked all the standard things like pagepool and maxFilesToCache
(set to the default of 4000); workerThreads is set to 128 on the new
GPFS cluster (versus the default of 48 on the old).
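For completeness, the settings on each cluster can be compared attribute by
attribute, along these lines (run on a node in each cluster):

  mmlsconfig pagepool
  mmlsconfig maxFilesToCache
  mmlsconfig maxStatCache
  mmlsconfig workerThreads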
I'm not sure what else to look at on this one, hence why I'm asking the
community.
Thanks in advance
Peter Childs
ITS Research Storage
Queen Mary University of London.
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Peter Childs
ITS Research Storage
Queen Mary, University of London
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
--
Peter Childs
ITS Research Storage
Queen Mary, University of London
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss