Exclude some images / signature when importing a mail
It would be useful to be able to specify a list of blacklisted images, for example to exclude the images in a user's signature.
Parameters could be:
- name (if the image is not embedded)
- Image: remi's proposal is OK
- Text: a new table to store blacklisted text (as a first step, store plain text only, no HTML)
|related to GLPI-PROJECT - Feature #3246: Find solution to clean followups content before import on...||Closed||11/08/2011|
|blocked by GLPI Documentation - Task #4378: Exclude some images / signature when importing a mail||Closed||06/25/2013|
|blocked by GLPI Documentation - Task #4707: Exclude some images / signature when importing a mail||Closed||12/10/2013|
Updated by remi almost 3 years ago
Proposal: add an "is_blacklisted" attribute to documents.
Blacklisted documents are then ignored during import.
- Subject changed from Exclude some images when importing a mail to Exclude some images / signature when importing a mail
- Status changed from New to Resolved
- % Done changed from 50 to 100
Applied in changeset r21192.
- Status changed from Closed to Feedback
Please review these proposals to improve this feature:
1) Text Cleaner
There is a dedicated table in the DB where I store regexes for:
a) detecting the start of a signature (there are many different signature openings in my group, so this is a list of regex records, each with a priority)
b) detecting the end of a signature (same structure as the start-of-signature regexes)
c) cleaning ticket content and follow-up content. This list is applied with preg_replace to replace one piece of text with another (usually an empty string). The list is also ordered by priority.
d) the same as c), but for cleaning ticket titles.
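To make the Text Cleaner proposal easier to review, here is a minimal sketch of how such prioritized rules could be applied. It is written in Python for illustration only (the actual implementation uses PHP's preg_replace), and everything in it is an assumption rather than the real code: the `Rule` fields, the rule kinds, the convention that a lower priority value runs first, the choice to drop everything from the first start match up to the first end match (or to the end of the text when no end regex matches), and the sample patterns.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    kind: str              # hypothetical kinds: "sig_start", "sig_end", "content", "title"
    priority: int          # assumption: lower value is tried/applied first
    pattern: str
    replacement: str = ""  # only used by "content" and "title" rules


def rules_of(rules, kind):
    """Return the rules of one kind, ordered by priority."""
    return sorted((r for r in rules if r.kind == kind), key=lambda r: r.priority)


def strip_signature(text, rules):
    """Drop the span from the first start match to the first end match."""
    start = None
    for rule in rules_of(rules, "sig_start"):
        match = re.search(rule.pattern, text, flags=re.MULTILINE)
        if match:
            start = match.start()
            break
    if start is None:
        return text  # no signature detected
    end = len(text)  # assumption: without an end match, cut to the end
    for rule in rules_of(rules, "sig_end"):
        match = re.compile(rule.pattern, re.MULTILINE).search(text, start)
        if match:
            end = match.end()
            break
    return text[:start] + text[end:]


def apply_replacements(text, rules, kind):
    """Apply every replacement rule of the given kind, in priority order."""
    for rule in rules_of(rules, kind):
        text = re.sub(rule.pattern, rule.replacement, text, flags=re.MULTILINE)
    return text


def clean_content(text, rules):
    return apply_replacements(strip_signature(text, rules), rules, "content").strip()


def clean_title(title, rules):
    return apply_replacements(title, rules, "title").strip()


# Sample rules, invented for this example.
RULES = [
    Rule("sig_start", 10, r"^-- ?$"),
    Rule("sig_start", 20, r"^Best regards,?\s*$"),
    Rule("sig_end", 10, r"^Sent from my phone$"),
    Rule("content", 10, r"\[cid:[^\]]+\]"),
    Rule("title", 10, r"^(?:(?:RE|FW|TR)\s*:\s*)+"),
]
```

With these sample rules, `clean_title("RE: FW: Printer down", RULES)` gives `"Printer down"`, and a body ending in a `-- ` signature block is cut at that line before the content replacements run.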
2) Picture Cleaner
This time there are 2 tables in the DB: one to store the hashes of blacklisted pictures, and one to store the date of the last refresh of the picture hash table.
There is also a new folder called 'pictures' that contains the blacklisted pictures. When I see pictures that must be blacklisted, I copy them to this folder (I have already populated it with a bunch of known pictures from email signatures).
a) Each time a ticket or a follow-up is created, I check the modification date of the "pictures" folder. If it is more recent than the last refresh date stored in the DB, I refresh the hash table with the hashes of the pictures contained in the folder.
b) A hash is computed for each picture uploaded with the ticket. If it matches one of the hashes in the DB, the document is removed from the $_FILES array.
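The Picture Cleaner steps above can be sketched as follows. Again this is an illustrative Python sketch under stated assumptions, not the actual PHP code: the `PictureBlacklist` class and its method names are hypothetical, an in-memory set and attribute stand in for the two DB tables, SHA-256 is an assumed hash function (the proposal does not name one), the folder's modification time seen at refresh is stored in place of the refresh date, and a plain dict of filename to bytes stands in for the $_FILES array.

```python
import hashlib
from pathlib import Path


class PictureBlacklist:
    def __init__(self, folder):
        self.folder = Path(folder)
        self.hashes = set()       # stands in for the hash table in the DB
        self.last_refresh = None  # stands in for the stored refresh date

    @staticmethod
    def digest(data: bytes) -> str:
        # Assumption: SHA-256; any stable content hash would work the same way.
        return hashlib.sha256(data).hexdigest()

    def refresh_if_needed(self) -> bool:
        """Rebuild the hash set only if the folder changed since the last refresh."""
        mtime = self.folder.stat().st_mtime_ns
        if self.last_refresh is not None and mtime <= self.last_refresh:
            return False
        self.hashes = {
            self.digest(path.read_bytes())
            for path in self.folder.iterdir()
            if path.is_file()
        }
        self.last_refresh = mtime
        return True

    def filter_uploads(self, uploads: dict) -> dict:
        """Return the uploads whose content hash is not blacklisted."""
        self.refresh_if_needed()
        return {
            name: data
            for name, data in uploads.items()
            if self.digest(data) not in self.hashes
        }
```

One point worth checking during review: on most filesystems a folder's modification date changes when a file is added, removed or renamed in it, but not when an existing file is overwritten in place, so replacing a picture without renaming it may not trigger a refresh.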