Discussion:
[Aide] AIDE performance problems
Paul Hessels
2015-04-05 21:11:25 UTC
I have a rather large dataset that I am running AIDE against. I am running
AIDE on my backup server that has a bunch of systems backups on it. The
server in question has about 5TB of data and about 200 million files.

Currently it's taking about 27 hours to run. The system is 15 drives in
RAID 0 with XFS.

Oddly, the bottleneck doesn't seem to be disk. iostat lists it as only
25% util. Simple dd tests suggest that percentage is right for both
throughput and IOPS.

The CPU doesn't seem to be the bottleneck either. It's about 65% idle most
of the time. The process seems to be sitting in the 'D' state most of the
time.

The number of interrupts seems reasonable...

I don't know what to do next. I need this to run in 20 hours or less.
David Duccini
2015-04-05 21:48:29 UTC
I don't think it is multi-threaded? Or at least not by default.

_______________________________________________
Aide mailing list
https://mailman.cs.tut.fi/mailman/listinfo/aide
Andy Lawrence
2015-04-05 21:50:08 UTC
Are you concerned with intrusion detection or only mystery bit flips? If
only bit flips, change your hash on that directory to md5 only; it will be
much faster.
Paul Hessels
2015-04-05 22:26:33 UTC
I failed to mention, I am only using md5 as it stands.
Paul Hessels
2015-04-06 01:23:16 UTC
I will give that a try, although it's far from ideal.

Not that it likely matters, but I misread the number of files. It's only
20 million.
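
For scale, the corrected figures can be turned into rough rates. The Python sketch below assumes that "5TB" means 5 * 10**12 bytes and that run time scales with the number of files and bytes processed; neither assumption is stated in the thread.

```python
# Back-of-the-envelope rates from the figures quoted in this thread.
# Assumption: "5TB" is read as 5 * 10**12 bytes (decimal terabytes).
FILES = 20_000_000          # corrected file count
DATA_BYTES = 5 * 10**12     # total data on the backup server
CURRENT_HOURS = 27          # observed run time
TARGET_HOURS = 20           # required run time


def rates(hours):
    """Return (files per second, megabytes per second) for a run of `hours`."""
    seconds = hours * 3600
    return FILES / seconds, DATA_BYTES / seconds / 10**6


current_files, current_mb = rates(CURRENT_HOURS)
target_files, target_mb = rates(TARGET_HOURS)
speedup = CURRENT_HOURS / TARGET_HOURS

print(f"current: {current_files:.0f} files/s, {current_mb:.1f} MB/s")
print(f"target:  {target_files:.0f} files/s, {target_mb:.1f} MB/s")
print(f"required speedup: {speedup:.2f}x")
```

Under these assumptions, getting from 27 hours to 20 hours needs a speedup of 1.35x.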

On Sun, Apr 5, 2015 at 7:51 PM, Brian Mathis wrote:
Try to split the run across multiple processes by making different
aide.conf files that cover different sets of files, then start each
process with "-c config.conf" and see what happens. If it goes faster,
then you're CPU bound. If not, then it's IO.
❧ Brian Mathis
@orev
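
A rough sketch of that experiment in Python. It assumes the per-subset config files have already been written by hand; the file names in the usage comment are hypothetical. Each config would presumably also need its own database location so the processes do not read and write the same file. The helper names `build_commands` and `run_parallel` are made up for this illustration.

```python
import subprocess
import time


def build_commands(config_paths, aide_binary="aide"):
    """Return one 'aide --check -c <config>' command per config file."""
    return [[aide_binary, "--check", "-c", path] for path in config_paths]


def run_parallel(commands):
    """Start all commands at once and wait for every one of them.

    Returns (exit_codes, elapsed_seconds), where elapsed_seconds is the
    wall-clock time until the slowest process has finished.
    """
    start = time.monotonic()
    procs = [subprocess.Popen(cmd) for cmd in commands]
    exit_codes = [proc.wait() for proc in procs]
    return exit_codes, time.monotonic() - start


# Hypothetical usage, with two hand-written configs covering disjoint trees:
#   codes, seconds = run_parallel(build_commands(["aide-a.conf", "aide-b.conf"]))
#   print(codes, f"{seconds / 3600:.1f} h")
```

Comparing the elapsed time against a single-process run over the same files gives the comparison described above.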
Richard van den Berg
2015-04-06 07:14:33 UTC
Post by Paul Hessels
The CPU doesn't seem to be the bottleneck. It's about 65% idle most of the time. The process seems to be sitting in the 'D' state most of the time.
How is your RAM utilization? AIDE uses mmap by default, but if your system is RAM-starved, that might not be the best way to access files.

Kind regards,

Richard
Paul Hessels
2015-04-06 14:35:33 UTC
The system has 12 GB of memory, and while the test is running it shows only
about 150 MB free. I will install some tools to get better stats.
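
A low "free" figure is hard to interpret on its own, because Linux counts page cache as used memory. As a minimal sketch, the Python below parses /proc/meminfo and adds MemFree, Buffers and Cached as a rough upper bound on memory the kernel could still hand out. The sample values are hypothetical and only roughly match the numbers in this thread; on a real system, `read_meminfo()` would be used instead.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live /proc/meminfo (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


def reclaimable_estimate_kb(info):
    """MemFree + Buffers + Cached: a rough estimate, not an exact figure.

    Cached also includes pages that cannot be dropped immediately
    (for example dirty or actively mapped pages), so treat it as an upper bound.
    """
    return info.get("MemFree", 0) + info.get("Buffers", 0) + info.get("Cached", 0)


# Hypothetical sample, not taken from the server discussed here.
SAMPLE = """\
MemTotal:       12288000 kB
MemFree:          153600 kB
Buffers:          204800 kB
Cached:         10240000 kB
"""

info = parse_meminfo(SAMPLE)
print(f"free: {info['MemFree']} kB, free+buffers+cached: {reclaimable_estimate_kb(info)} kB")
```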
Hannes von Haugwitz
2015-04-06 17:04:31 UTC
Hi,
Which version of AIDE are you using?

Best regards

Hannes
Paul Hessels
2015-04-06 17:30:18 UTC
It's from Debian wheezy:

# aide --version
Aide 0.15.1

Compiled with the following options:

WITH_MMAP
WITH_POSIX_ACL
WITH_SELINUX
WITH_XATTR
WITH_E2FSATTRS
WITH_LSTAT64
WITH_READDIR64
WITH_ZLIB
WITH_MHASH
CONFIG_FILE = "/dev/null"
Hannes von Haugwitz
2015-04-08 19:48:09 UTC
Hi,
Post by Paul Hessels
# aide --version
Aide 0.15.1
Last night I uploaded the version from Debian jessie
(0.16~a2.git20130520-3) to wheezy-backports. Please give this version a
try. It might perform a bit better.

Best regards

Hannes
Paul Hessels
2015-04-09 13:40:10 UTC
Thank you Hannes, I will give it a try. It takes a long time to run my
tests, of course.

With the old version I think I am encountering a memory leak: eventually
the aide process is using 12 GB of RAM and the machine dies. Hopefully
the new version doesn't have this problem.
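
One way to tell a leak from normal growth is to log the resident set size of the process over time, together with its scheduler state. A minimal Python sketch, assuming a Linux /proc filesystem; the helper names and the sample text are made up for illustration:

```python
import time


def parse_status(text):
    """Parse /proc/<pid>/status-style text into a dict of raw string values."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def state_and_rss_kb(fields):
    """Return (state letter, VmRSS in kB or None if the field is absent)."""
    state_parts = fields.get("State", "").split()
    state = state_parts[0] if state_parts else "?"
    rss_parts = fields.get("VmRSS", "").split()
    rss_kb = int(rss_parts[0]) if rss_parts and rss_parts[0].isdigit() else None
    return state, rss_kb


def sample(pid, count, interval_s):
    """Take `count` readings of (state, rss_kb) for `pid`, `interval_s` apart."""
    rows = []
    for _ in range(count):
        with open(f"/proc/{pid}/status") as handle:
            rows.append(state_and_rss_kb(parse_status(handle.read())))
        time.sleep(interval_s)
    return rows


# Hypothetical sample text, not captured from the machine in this thread.
SAMPLE_STATUS = "Name:\taide\nState:\tD (disk sleep)\nVmRSS:\t 1048576 kB\n"
print(state_and_rss_kb(parse_status(SAMPLE_STATUS)))
```

Calling something like `sample(pid, 60, 60)` against the running aide process would show whether the RSS keeps climbing for the whole run or levels off.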
Paul Hessels
2015-04-08 19:57:50 UTC
Thank you Hannes, I will give it a try. It takes a long time to run my
tests, of course, so I will let this current test finish and then try the
upgrade.

In the meantime, I think I am seeing a memory leak in the version I have.
The blank spot is after the machine died completely. It's an old version,
so I'll hold off on more troubleshooting until I try the backport.

[image: Inline image 1]