Cool HTML Tag and Attribute Online Stripper

Created: April 22, 2011
Last Modified: January 3, 2017
Subscribe to Internet Tips and Tools Feed

I created this HTML stripper mainly because of the mess that programs like Word and Excel make when you save a document as HTML or convert it to HTML. They produce very messy, redundant HTML code. This online HTML tag and attribute stripper removes all tags and attributes except the ones you specify as allowed.
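A minimal sketch of the whitelist idea in Python, using only the standard library. This is an illustration, not the code behind this page: the names `WhitelistStripper` and `strip_html` are hypothetical, and the sketch keeps the text inside removed tags (including the contents of style blocks), which a real cleaner would probably want to handle separately.

```python
from html import escape
from html.parser import HTMLParser


class WhitelistStripper(HTMLParser):
    """Rebuilds the markup, keeping only whitelisted tags and attributes."""

    def __init__(self, allowed_tags, allowed_attrs):
        # convert_charrefs=False leaves entities such as &copy; untouched
        super().__init__(convert_charrefs=False)
        self.allowed_tags = {t.lower() for t in allowed_tags}
        self.allowed_attrs = {a.lower() for a in allowed_attrs}
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.allowed_tags:
            return  # drop the tag itself, keep the text inside it
        kept = "".join(
            f' {name}="{escape(value or "")}"'
            for name, value in attrs
            if name in self.allowed_attrs
        )
        self.out.append(f"<{tag}{kept}>")

    def handle_endtag(self, tag):
        if tag in self.allowed_tags:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")  # pass named entities through

    def handle_charref(self, name):
        self.out.append(f"&#{name};")  # pass numeric entities through


def strip_html(source, allowed_tags, allowed_attrs):
    """Hypothetical helper: return source with non-whitelisted markup removed."""
    parser = WhitelistStripper(allowed_tags, allowed_attrs)
    parser.feed(source)
    parser.close()
    return "".join(parser.out)
```

For example, calling `strip_html(word_html, ["p", "b"], ["class"])` would keep `<p>` and `<b>` tags and any `class` attributes on them, and drop everything else.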


Allowed Tags:

Allowed Attributes:

Also check out:
HTML to BBCODE Converter
HTML Entity to Text Converter
Sea Breeze Computers Home Page

Copyright © 2011 by Jeff Baker

History

2/23/2015 - ver. 1.1g - Unfortunately, version 1.1f of the HTML stripper caused Cyrillic characters such as абв to be converted to HTML entities such as &#1072;&#1073;&#1074;. To fix this, there is now a checkbox option, "Convert HTML Entities to characters". If it is left unchecked, the HTML stripper acts as version 1.1f did. If it is checked, Cyrillic characters are displayed, but HTML entities are converted into their character equivalents.
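The trade-off behind that checkbox can be sketched in Python. This is only an illustration under assumptions: the function names `to_ascii_entities` and `finalize` are hypothetical, and the unchecked branch assumes that the 1.1f behavior amounts to encoding non-ASCII characters as numeric character references, which is not confirmed by the changelog.

```python
import html


def to_ascii_entities(text):
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def finalize(text, convert_entities):
    """Hypothetical last step of the stripper, mirroring the checkbox.

    convert_entities=True  -> entities become characters, Cyrillic survives
    convert_entities=False -> entities are left alone, but non-ASCII text
                              is itself turned into numeric entities
    """
    if convert_entities:
        return html.unescape(text)
    return to_ascii_entities(text)
```

With the option on, `&copy;` becomes the copyright character and Cyrillic text passes through unchanged. With it off, `&copy;` stays as written but Cyrillic letters come out as numeric references.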

11/25/2015 - ver. 1.1f - As noticed by Liam in the comments, the HTML stripper was converting HTML entities into text. Example: &copy; &reg; &trade; was converted to © ® ™. I believe I have fixed the problem. If the fix is causing other problems, please let me know.

5/4/2013 - ver. 1.1e - Again, MS Word smart quotes were not being converted! I still don't know why this is happening. So I combined the old method with the new method for converting smart quotes, and they are converting properly again.

3/15/2013 - ver. 1.1d - Increased the character limit to 100,000 characters on a trial basis.

3/5/2013 - ver. 1.1c - Bug Fix - I noticed that MS Word smart quotes (curly quotes) were no longer being converted correctly. I am not sure why. Maybe it was a server upgrade. So I changed to a hex-based way of converting smart quotes. Now MS Word “quotes”, ‘single quotes’, dashes (–) and ellipses (…) should convert properly.
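A rough Python sketch of a hex-based conversion: the characters are matched by their Unicode code points rather than by literal curly characters pasted into the source, so the table does not depend on how the script file itself is encoded. The table and the name `plain_punctuation` are hypothetical, not the code used on this page.

```python
# Map Word's "smart" punctuation, by hex code point, to plain ASCII.
SMART_PUNCTUATION = {
    0x2018: "'",    # left single quotation mark
    0x2019: "'",    # right single quotation mark
    0x201C: '"',    # left double quotation mark
    0x201D: '"',    # right double quotation mark
    0x2013: "-",    # en dash
    0x2014: "-",    # em dash
    0x2026: "...",  # horizontal ellipsis
}


def plain_punctuation(text):
    """Replace smart quotes, dashes and ellipses with ASCII equivalents."""
    return text.translate(SMART_PUNCTUATION)
```

`str.translate` accepts a dictionary keyed by code point, so one pass over the text handles every entry in the table.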

10/05/2011 - ver. 1.1b - Bug Fix - The stripper was not removing attributes if they were on a new line. That has been fixed. For example, previously for the statement:

<a href="www.sample.com"
onclick="javascript">
the onclick would not be removed because it was on a new line. Now it will be removed unless onclick is in the Allowed Attributes field.
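A small Python sketch of how this kind of bug can be avoided when attributes are matched with a regular expression: using `\s+` for the separator (instead of a literal space) also matches a line break in front of the attribute. The pattern and the name `strip_attributes` are hypothetical, and the sketch only handles double-quoted attribute values.

```python
import re

# \s+ matches spaces, tabs and newlines, so an attribute that starts
# on a new line is found just like one on the same line.
ATTRIBUTE = re.compile(r'\s+([\w-]+)\s*=\s*"[^"]*"')


def strip_attributes(tag, allowed_attrs):
    """Remove every attribute from one opening tag unless it is allowed."""
    allowed = {a.lower() for a in allowed_attrs}

    def keep_or_drop(match):
        return match.group(0) if match.group(1).lower() in allowed else ""

    return ATTRIBUTE.sub(keep_or_drop, tag)


tag = '<a href="www.sample.com"\nonclick="javascript">'
print(strip_attributes(tag, ["href"]))  # prints <a href="www.sample.com">
```

If onclick is added to the allowed list, the tag comes back unchanged, line break included.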

08/24/2011 - ver. 1.1 - Increased the character limit to 30,000 characters. The HTML Stripper now also removes MS Word smart quotes and dashes from the document. If you are seeing the Unicode replacement character (U+FFFD) in your HTML document, use this HTML tag stripper to remove it.

04/22/2011 - ver. 1.0 - Online HTML Tag and Attribute Stripper created. Note: The HTML source is limited to 20,000 characters. Tags that do not have a corresponding closing tag are also properly closed.

Back to www.seabreezecomputers.com

User Comments

There are 50 comments.

Displaying first 35 comments.

1. Posted By: riverstore - - July 20, 2011, 8:00 pm
Thanks for the great tool! I use it to clean MS Word HTML

2. Posted By: Rob Anderson - - December 17, 2011, 3:18 pm
Just what I was looking for - thanks for making this available!

3. Posted By: Raphael - - November 12, 2012, 5:00 pm
Thanks a lot for this great script. I just had to clean up the ugliest code known to man, and I don't know how I would have done it without you.

Revolution Graphics

4. Posted By: Nathan Kinsler - - January 29, 2013, 6:43 am
Excellent tool. This has saved us a lot of time. Thanks for sharing this.

BR.

Nathan.

5. Posted By: Robert Rudolf - - February 3, 2013, 2:03 am
Thank you! I was desperate to find a tool like this. Great work.

6. Posted By: Jeremy Ratliff - - February 8, 2013, 7:48 am
Thank you for this awesome tool, I too use it to clean up nasty MS Word HTML formatting.

7. Posted By: sebastiano - - March 8, 2013, 5:56 pm
Thank you very much for this, but could you remove the HTML code size limit? I have very big articles and your script cuts them off. Hope you can help.

8. Posted By: al bundy - - March 12, 2013, 9:19 pm
Thank you, but the tool can't handle 190kb.

9. Posted By: Alex - - March 20, 2013, 3:22 am
Hi Jeff!

Cool thing!
Can you give the code of this webtool to use it only in my intranet-webpage?

Thank you very much.
With best regards, Alex Golovlev.
E:mail: a_g0[at]mail[dot]ru

10. Posted By: Nik - - March 24, 2013, 9:01 am
Nice tool, but it is stripping the class attribute out of span elements, even though span is listed as an element not to strip, and id and class are listed as attributes not to strip.

11. Posted By: Jeff - - March 25, 2013, 8:00 pm
Hi Nik,

I'm not sure why that is happening to you. If I list <span> in the allowed tags and class in the allowed attributes then the html stripper does not strip them for me. Maybe you could provide some of the code you are trying and I can try and see why you are having a problem.

Jeff
www.seabreezecomputers.com/

12. Posted By: Nik - - March 26, 2013, 11:37 am
Jeff, sorry, I tried to edit my comment but didn't want to spam your blog with comments. I was missing a comma between two of the attributes and this was throwing it off. Once I fixed that it worked perfectly!

Regards,

Nik

13. Posted By: Hardik Sondagar - - May 10, 2013, 6:12 am
If anyone is looking for something similar with source code,
then check this: www.htmltagstripper.com


14. Posted By: Leon Cheng - jq153387@gmail.com - June 6, 2013, 8:32 pm
CJHTML can help you get a nice clean HTML

cjhtml.citiar.com/

chrome online store

chrome.google.com/webstore/detail/cjhtml/ekcpokmjjfacpjjcpnkpdihjjpiphoph?hl=zh-TW&utm_source=chrome-ntp-launcher

Editor Note: The HTML cleaner CJHTML mentioned in this post seems to do the opposite of HTML stripper. Rather than specifying which HTML tags and attributes you want to keep, with CJHTML you specify which HTML tags and attributes you want stripped.


15. Posted By: V3.in.th - - October 22, 2013, 11:50 am
working great



16. Posted By: infinitebuzz - - November 13, 2013, 6:58 pm
Awesome tool. You just saved me a lot of time cleaning up a joomla mess!

17. Posted By: Kamal - - November 14, 2013, 6:17 pm
Great online tool!

18. Posted By: Lrobinson - - January 9, 2014, 6:35 am
Love this tool! Just what I was looking for! Thanks!

19. Posted By: Cody - - January 24, 2014, 1:04 pm
Just wanted to thank you for this amazing tool! I have been searching everywhere for something like this. Thanks!

20. Posted By: Praxis - - January 31, 2014, 3:48 pm
THANK YOU for this tool. I finally gave up on the HTML I was cleaning, looked for a styling stripper, and found this tool.

Much appreciated!

21. Posted By: Sparky Ppop - - February 25, 2014, 4:46 am
First - fantastic tool! Takes care of 85% of the MS Word code I have to manage. Any chance you would open source the code? If not I understand, but it never hurts to ask. Thanks again!

22. Posted By: Steve Webb - - March 27, 2014, 10:34 am
Wish I had found this sooner! Great tool. Quick. Does *exactly* what I want it to do. Thanks so much for making this available.

23. Posted By: Alex Branning - - April 3, 2014, 8:51 am
Thank you for this tool. Do you have a Paypal account we can send a "thank you" to?

24. Posted By: Jeff - - April 3, 2014, 11:56 am
Hi Alex,

Thanks for the comment. You can send a Paypal "thank you" to this email address: jeffsbaker@sbcglobal.net

Jeff
www.seabreezecomputers.com/

25. Posted By: Attila - - May 12, 2014, 8:18 am
Thank you! Great page for generated HTML files.

26. Posted By: KAI CHUNG - - June 1, 2014, 7:18 am
Greatest Tool ever

27. Posted By: havill - - June 30, 2014, 10:50 pm
Well done. I don't want a regex that only works when embedded in a certain language or a certain editor. This is a tool that will even work with a Chromebook and a WordPress/Blogger editor.

28. Posted By: Iwan Gabovitch - - August 13, 2014, 4:36 am
Thank you very much. LibreOffice Calc HTML export is a mess as well, and this is a great help!

29. Posted By: Bryan - - September 11, 2014, 11:14 pm
This is the best one I've found

30. Posted By: Bob - - September 19, 2014, 6:39 am
Excellent! This works great. I love the ability to specify the tags and attributes to leave. Good job!

31. Posted By: David Hoffman - - September 22, 2014, 2:44 pm
How about allowing , , , etc. tags? They're semantically meaningful, and current MS Office apps will create those tags.

Not sure if current MS Word renders HTML5 tags such as , but to be future proof you could white-list those too.

32. Posted By: Jeff - - September 22, 2014, 7:18 pm
Hi David Hoffman,

I can't see which HTML tags you are mentioning because they were stripped out of the comment. If you want us to see the tags, include them in the comment without the greater-than and less-than symbols.

But in the html stripper you can put whatever tags you don't want stripped out in the "Allowed Tags" field.

Jeff
www.seabreezecomputers.com/

33. Posted By: webbystripper - owen_francis@hotmail.com - November 17, 2014, 3:40 am
Hi,

I have been using this quite a bit recently and I was gutted to discover that it's broke! There is no return for the stripped output anymore!

34. Posted By: Jeff - - November 17, 2014, 4:09 pm
Hello Owen,

The HTML stripper is working fine at the moment. I'm not sure why it would stop working for you. Maybe there is a browser problem, or the HTML you are submitting is too large. But if it were too large, the excess would just be cut off.

Jeff
www.seabreezecomputers.com/

35. Posted By: SB - - January 18, 2015, 3:25 pm
Cool tool!