Cool HTML Tag and Attribute Online Stripper

Created: April 22, 2011
Last Modified: December 4, 2017

I created this HTML stripper mainly because of the mess that programs like Word and Excel make of HTML when you save or convert a document to HTML. They produce very messy, redundant HTML code. This online HTML tag and attribute stripper removes all tags and attributes except the ones you specify in the Allowed Tags and Allowed Attributes fields.
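
The source code of this tool is not published, so the following is only a minimal sketch of the allowlist idea described above, written in Python with the standard library `html.parser` module. The names `AllowlistStripper`, `strip_html`, and `VOID_TAGS` are hypothetical and not taken from the tool. The sketch keeps all text, drops comments, and does not add missing closing tags.

```python
from html import escape
from html.parser import HTMLParser

# Void elements never take a closing tag (partial list, assumed for this sketch).
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class AllowlistStripper(HTMLParser):
    """Hypothetical helper: keeps only allowed tags and attributes."""

    def __init__(self, allowed_tags, allowed_attrs):
        # convert_charrefs=False so entities in text pass through unchanged.
        super().__init__(convert_charrefs=False)
        self.allowed_tags = {t.lower() for t in allowed_tags}
        self.allowed_attrs = {a.lower() for a in allowed_attrs}
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.allowed_tags:
            return  # drop the tag itself, keep whatever text follows
        parts = [tag]
        for name, value in attrs:
            if name in self.allowed_attrs:
                # The parser unescapes attribute values, so escape them again.
                parts.append(name if value is None else f'{name}="{escape(value)}"')
        self.out.append("<" + " ".join(parts) + ">")

    def handle_endtag(self, tag):
        if tag in self.allowed_tags and tag not in VOID_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")


def strip_html(source, allowed_tags=(), allowed_attrs=()):
    """Return source with every tag and attribute not on the allowlists removed."""
    parser = AllowlistStripper(allowed_tags, allowed_attrs)
    parser.feed(source)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    word_html = ('<p class="MsoNormal" style="margin:0"><span lang="EN-US">Hi</span> '
                 '<a href="x.html" onclick="go()">link</a></p>')
    print(strip_html(word_html, allowed_tags=["p", "a"], allowed_attrs=["href"]))
```

Using a parser rather than pattern matching means attribute quoting and line breaks inside a tag are handled by the library. One limitation: the text inside a stripped script or style element is kept, so a real tool would probably treat those elements specially.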


Allowed Tags:

Allowed Attributes:

Also check out:
HTML to BBCODE Converter
HTML Entity to Text Converter
Sea Breeze Computers Home Page

Copyright © 2011 by Jeff Baker

History

6/12/2017 - ver. 1.1h - Possibly fixed a recent Chrome error:

	This page isn't working
	Chrome detected unusual code on this page and blocked it to protect your personal information
	(for example, passwords, phone numbers, and credit cards).
	Try visiting the site's homepage.
	ERR_BLOCKED_BY_XSS_AUDITOR

This error appeared when Chrome detected script tags, form tags, or some other HTML tags in a submitted form field. Fixed by sending this response header from PHP: header('X-XSS-Protection: 0');

2/23/2015 - ver. 1.1g - Unfortunately, version 1.1f of the HTML stripper caused Cyrillic characters such as абв to be converted to абв. To fix this, there is now a checkbox option, "Convert HTML Entities to characters". If it is left unchecked, the HTML stripper behaves as in version 1.1f. If it is checked, Cyrillic characters are displayed correctly, but HTML entities are converted into their character equivalents.
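
As a rough illustration of what such a checkbox toggles, here is a small Python sketch. It is not the tool's code; the function name `finalize_text` and its `convert_entities` parameter are hypothetical. It relies on the standard library function `html.unescape`, which decodes both named and numeric entities.

```python
import html


def finalize_text(text, convert_entities=False):
    """Optionally decode HTML entities into their character equivalents."""
    # With the option off, entities such as &copy; are left exactly as typed.
    return html.unescape(text) if convert_entities else text


if __name__ == "__main__":
    sample = "&copy; 2011 &#1072;&#1073;&#1074;"
    print(finalize_text(sample))                         # entities preserved
    print(finalize_text(sample, convert_entities=True))  # entities decoded
```

The trade-off is the one described in the entry above: decoding makes entity-encoded Cyrillic readable, but it also turns every other entity into a literal character.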

11/25/2015 - ver. 1.1f - As noticed by Liam in the comments, the HTML stripper was converting HTML entities into text. Example: the entities for © ® ™ were converted to the literal characters © ® ™. I believe I have fixed the problem. If the fix causes other problems, please let me know.

5/4/2013 - ver. 1.1e - Again, MS Word smart quotes were not being converted! I still don't know why this is happening. So I combined the old method with the new method for converting smart quotes, and they are converting properly again.

3/15/2013 - ver. 1.1d - Increased the character limit to 100,000 characters on a trial basis.

3/5/2013 - ver. 1.1c - Bug Fix - I noticed that MS Word smart quotes (curly quotes) were no longer being converted correctly. I am not sure why. Maybe it was a server upgrade. So I switched to matching smart quotes by their hex codes. Now MS Word "quotes", 'single quotes', dashes (-), and ellipses (...) should convert properly.
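
One way to match smart punctuation by hex code, sketched in Python, is a translation table keyed by Unicode code point. This is an assumption about the general technique, not the tool's actual method; the names `SMART_PUNCTUATION` and `replace_smart_punctuation` are hypothetical, and the exact set of characters the tool converts is not documented.

```python
# Map Unicode code points (written in hex) to plain ASCII replacements.
SMART_PUNCTUATION = {
    0x2018: "'",    # left single quotation mark
    0x2019: "'",    # right single quotation mark
    0x201C: '"',    # left double quotation mark
    0x201D: '"',    # right double quotation mark
    0x2013: "-",    # en dash
    0x2014: "-",    # em dash
    0x2026: "...",  # horizontal ellipsis
}


def replace_smart_punctuation(text):
    """Replace MS Word style curly quotes, dashes, and ellipses with ASCII."""
    # str.translate accepts a dict of code point -> replacement string.
    return text.translate(SMART_PUNCTUATION)


if __name__ == "__main__":
    print(replace_smart_punctuation("\u201cquotes\u201d \u2018single\u2019 \u2013 \u2026"))
```

Working with code points avoids depending on how the source file or the server encodes the literal curly characters, which is one plausible reason a conversion could stop working after a server change.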

10/05/2011 - ver. 1.1b - Bug Fix - The stripper was not removing attributes if they were on a new line. That has been fixed. For example, previously for the statement:

<a href="www.sample.com"
onclick="javascript">
the onclick would not be removed because it was on a new line. Now it will be removed unless onclick is in the Allowed Attributes field.
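
The kind of pattern that avoids this bug can be sketched in Python. This is a hypothetical regex-based version, not the tool's code; `TAG_RE`, `ATTR_RE`, and `strip_attributes` are made-up names. The key point is that `\s` matches newlines as well as spaces, so an attribute that starts on a new line is still found.

```python
import re

# A start tag: name, then attribute text that may span several lines.
TAG_RE = re.compile(r"""<([A-Za-z][A-Za-z0-9]*)((?:[^>"']|"[^"]*"|'[^']*')*)>""")

# One attribute: leading whitespace (spaces, tabs, or newlines), a name,
# and an optional double-quoted, single-quoted, or bare value.
ATTR_RE = re.compile(r"""\s+([^\s=<>/"']+)(?:\s*=\s*(?:"[^"]*"|'[^']*'|[^\s>]+))?""")


def strip_attributes(source, allowed_attrs=()):
    """Remove every attribute whose name is not in allowed_attrs."""
    allowed = {a.lower() for a in allowed_attrs}

    def clean_tag(match):
        name, attr_text = match.group(1), match.group(2)
        kept = "".join(
            " " + m.group(0).strip()  # normalize the separator to one space
            for m in ATTR_RE.finditer(attr_text)
            if m.group(1).lower() in allowed
        )
        return f"<{name}{kept}>"

    return TAG_RE.sub(clean_tag, source)


if __name__ == "__main__":
    html_in = '<a href="www.sample.com"\nonclick="javascript">'
    print(strip_attributes(html_in, allowed_attrs=["href"]))
```

Note that this sketch also drops a trailing slash in tags such as a self-closing br, and that regular expressions are a fragile way to process HTML in general; a parser-based approach is usually safer.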

08/24/2011 - ver. 1.1 - Increased the character limit to 30,000 characters. The HTML Stripper now also removes MS Word smart quotes and dashes from the document. If you are seeing the Unicode replacement character (U+FFFD) in your HTML document, use this HTML tag stripper to remove it.

04/22/2011 - ver. 1.0 - Online HTML Tag and Attribute Stripper created. Note: The HTML source is limited to 20,000 characters. Tags that do not have a corresponding closing tag are also properly closed.


User Comments

There are 53 comments.

Displaying first 50 comments.

1. Posted By: riverstore - - July 20, 2011, 8:00 pm
Thanks for the great tool! I use it to clean MS Word HTML

2. Posted By: Rob Anderson - - December 17, 2011, 3:18 pm
Just what I was looking for - thanks for making this available!

3. Posted By: Raphael - - November 12, 2012, 5:00 pm
Thanks a lot for this great script. I just had to clean up the ugliest code known to man, and I don't know how I would have done it without you.

Revolution Graphics

4. Posted By: Nathan Kinsler - - January 29, 2013, 6:43 am
Excellent tool. This has saved us a lot of time. Thanks for sharing this.

BR.

Nathan.

5. Posted By: Robert Rudolf - - February 3, 2013, 2:03 am
Thank you! I was desperate to find a tool like this. Great work.

6. Posted By: Jeremy Ratliff - - February 8, 2013, 7:48 am
Thank you for this awesome tool, I too use it to clean up nasty MS Word HTML formatting.

7. Posted By: sebastiano - - March 8, 2013, 5:56 pm
Thank you very much for this, but could you remove the HTML code size limit? I have very big articles and your script cuts them off. Hope you can help.

8. Posted By: al bundy - - March 12, 2013, 9:19 pm
Thank you, but the tool can't handle 190kb.

9. Posted By: Alex - - March 20, 2013, 3:22 am
Hi Jeff!

Cool thing!
Can you give the code of this webtool to use it only in my intranet-webpage?

Thank you very much.
With best regards, Alex Golovlev.
E:mail: a_g0[at]mail[dot]ru

10. Posted By: Nik - - March 24, 2013, 9:01 am
Nice tool, but it is stripping the class attribute out of span elements, even though span is listed as an element not to strip, and id and class are listed as attributes not to strip.

11. Posted By: Jeff - - March 25, 2013, 8:00 pm
Hi Nik,

I'm not sure why that is happening to you. If I list <span> in the allowed tags and class in the allowed attributes then the html stripper does not strip them for me. Maybe you could provide some of the code you are trying and I can try and see why you are having a problem.

Jeff
www.seabreezecomputers.com/

12. Posted By: Nik - - March 26, 2013, 11:37 am
Jeff, sorry, I tried to edit my comment but didn't want to spam your blog with comments. I was missing a comma between two of the attributes and this was throwing it off. Once I fixed that it worked perfectly!

Regards,

Nik

13. Posted By: Hardik Sondagar - - May 10, 2013, 6:12 am
If anyone is looking for something similar with source code, check out www.htmltagstripper.com


14. Posted By: Leon Cheng - jq153387@gmail.com - June 6, 2013, 8:32 pm
CJHTML can help you get a nice clean HTML

cjhtml.citiar.com/

chrome online store

chrome.google.com/webstore/detail/cjhtml/ekcpokmjjfacpjjcpnkpdihjjpiphoph?hl=zh-TW&utm_source=chrome-ntp-launcher

Editor Note: The HTML cleaner CJHTML mentioned in this post seems to do the opposite of HTML stripper. Rather than specifying which HTML tags and attributes you want to keep, with CJHTML you specify which HTML tags and attributes you want stripped.


15. Posted By: V3.in.th - - October 22, 2013, 11:50 am
working great



16. Posted By: infinitebuzz - - November 13, 2013, 6:58 pm
Awesome tool. You just saved me a lot of time cleaning up a joomla mess!

17. Posted By: Kamal - - November 14, 2013, 6:17 pm
Great online tool!

18. Posted By: Lrobinson - - January 9, 2014, 6:35 am
Love this tool! Just what I was looking for! THanks!

19. Posted By: Cody - - January 24, 2014, 1:04 pm
Just wanted to thank you for this amazing tool! I have been searching everywhere for something like this. Thanks!

20. Posted By: Praxis - - January 31, 2014, 3:48 pm
THANK YOU for this tool. I finally gave up on the HTML I was cleaning by hand, looked for a styling stripper, and found this tool.

Much appreciated!

21. Posted By: Sparky Ppop - - February 25, 2014, 4:46 am
First - fantastic tool! Takes care of 85% of the MS Word code I have to manage. Any chance you would open source the code? If not I understand, but it never hurts to ask. Thanks again!

22. Posted By: Steve Webb - - March 27, 2014, 10:34 am
Wish I had found this sooner! Great tool. Quick. Does *exactly* what I want it to do. Thanks so much for making this available.

23. Posted By: Alex Branning - - April 3, 2014, 8:51 am
Thank you for this tool. Do you have a Paypal account we can send a "thank you" to?

24. Posted By: Jeff - - April 3, 2014, 11:56 am
Hi Alex,

Thanks for the comment. You can send a Paypal "thank you" to this email address: jeffsbaker@sbcglobal.net

Jeff
www.seabreezecomputers.com/

25. Posted By: Attila - - May 12, 2014, 8:18 am
Thank you! Great page for generated HTML files.

26. Posted By: KAI CHUNG - - June 1, 2014, 7:18 am
Greatest Tool ever

27. Posted By: havill - - June 30, 2014, 10:50 pm
Well done. I don't want a regex that only works when embedded in a certain language or a certain editor. This is a tool that will even work with a Chromebook and a WordPress/Blogger editor.

28. Posted By: Iwan Gabovitch - - August 13, 2014, 4:36 am
Thank you very much, libre office calc html export is a mess as well and this is a great help!

29. Posted By: Bryan - - September 11, 2014, 11:14 pm
This is the best one I've found

30. Posted By: Bob - - September 19, 2014, 6:39 am
Excellent! This works great. I love the ability to specify the tags and attributes to leave. Good job!

31. Posted By: David Hoffman - - September 22, 2014, 2:44 pm
How about allowing , , , etc. tags? They're semantically meaningful, and current MS Office apps will create those tags.

Not sure if current MS Word renders HTML5 tags such as , but to be future proof you could white-list those too.

32. Posted By: Jeff - - September 22, 2014, 7:18 pm
Hi David Hoffman,

I can't see which html tags you are mentioning because they were stripped out of the comment. If you want us to see the tags then include them in the comment without the greater than and lesser than symbols.

But in the html stripper you can put whatever tags you don't want stripped out in the "Allowed Tags" field.

Jeff
www.seabreezecomputers.com/

33. Posted By: webbystripper - owen_francis@hotmail.com - November 17, 2014, 3:40 am
Hi,

I have been using this quite a bit recently and I was gutted to discover that it's broke! There is no return for the stripped output anymore!

34. Posted By: Jeff - - November 17, 2014, 4:09 pm
Hello Owen,

The html stripper is working fine at the moment. I'm not sure why it would stop working for you. Maybe there is a browser problem or the html you are submitting is too large. But then if it is too large then the excess would just be cut off.

Jeff
www.seabreezecomputers.com/

35. Posted By: SB - - January 18, 2015, 3:25 pm
Cool tool!

36. Posted By: Adam - - May 13, 2015, 7:02 am
I was just googling - I'm amazed this tool exists!

37. Posted By: Przemek - - October 4, 2015, 4:25 am
Excellent tool. Thanks a lot!

38. Posted By: Jeff - - November 25, 2015, 7:06 pm
Hi Liam,

I was looking for a fix for the HTML entity problem and I think I found it. Check it out and let me know if it is working for you.

Jeff
www.seabreezecomputers.com/htmlstripper/

39. Posted By: agsamek - - December 15, 2015, 11:45 pm
Great tool. Thank you!

40. Posted By: BK - - March 25, 2016, 8:19 pm
This has been such a time-saver! I fear that you will eventually shut this down in the future. Will you ever offer an offline tool similar to this? I would definitely kick in some money if you wanted to crowd-fund it!

BK

41. Posted By: Jeff - - April 2, 2016, 11:45 am
Hi BK,

Thank you for the feedback! Actually just in the past few months revenue has not been keeping up with server costs. I haven't provided the source code for tools like this because then I would lose revenue on the webpage. But Google Adsense just has not been paying as much anymore. I'm not sure how I could crowd-fund an offline tool. Do you have any ideas? What OS do you think people would need it in?

Jeff
www.seabreezecomputers.com/

42. Posted By: BK - - October 3, 2016, 10:47 am
I'm so glad to see you've added a donation feature to your page. I happily sent you $50, since I'm in need of this tool again for work. ;-)

Best regards,
BK

43. Posted By: Jeff - - October 3, 2016, 12:46 pm
Hello BK,

Thank you very much for your donation! With donations like yours we will be able to continue to host tools like this one and create new tools.

Jeff
www.seabreezecomputers.com/

44. Posted By: M.C Shin - - January 3, 2017, 8:59 pm
Hi, I donated by Paypal several weeks ago.
I'm a Korean (South Korea)

Today, I attempted to strip HTML that contains Korean text as usual, but the resulting HTML contains many '�'.

Why does this issue happen?

45. Posted By: Jeff - - January 3, 2017, 10:02 pm
Hello M.C Shin,

I apologize for the problem. We recently added charset tags to some of our webpages, and apparently we should not have. We have removed them. Please try again and let us know if it is fixed.

Jeff
www.seabreezecomputers.com/

46. Posted By: M.C Shin - - January 4, 2017, 1:30 am
Thank you for your fix!! It works well now!!

There is an additional report.
All Korean text in the strip results is changed to weird strings in Google Chrome. (It works well with Firefox and IE.)
So I have no choice but to use this htmlstripper page with Firefox or IE.

1. Firefox or IE

1. 사인의 공법행위에 대한 설명으로 옳지 않은 것은? (다툼이 있는 경우 판례에 의함)



2. Chrome

1. 사인의 공법행위에 대한 설명으로 옳지 않은 것은? (다툼이 있는 경우 판례에 의함)






47. Posted By: M.C Shin - - January 4, 2017, 1:36 am
My screenshot is here

screenshot

48. Posted By: Jeff - - January 4, 2017, 1:14 pm
Hi M.C Shin,

I am not sure why Chrome does this. But I think I created a fix a year ago. Try checking the box "Convert HTML Entities to characters" and see if that fixes the problem.

Jeff
www.seabreezecomputers.com/

49. Posted By: Danny - - January 10, 2017, 1:26 am
Hi.
Great tool....but it can only handle around 2700 lines of HTML - any chance to extend this (say 10,000) as I have many large tables to "strip".

Thanks a lot for your time!
Dan

50. Posted By: Jeff - - January 10, 2017, 2:39 pm
Hi danny21,

I apologize for the inconvenience. Unfortunately, the html stripper is already up to 100,000 characters just as a trial and I am going to have to lower it in the future because people are using up so much bandwidth and I am not getting paid for it. If you would like to make a paypal donation to jeffsbaker@sbcglobal.net for an account with a larger limit then I can create one for you. It is $10 a month or $100 a year for a subscription and the limit will be 300,000 characters instead of 100,000.

Jeff
www.seabreezecomputers.com/