[Hidden-tech] MS Word to HTML Code

Jan Werner jwerner at jwdp.com
Tue Mar 2 18:09:33 EST 2010


Converting a 300-page Word document to a single HTML page is not a good 
idea, but breaking it into multiple sections takes some planning 
because there is no single best way to do it. Novels, textbooks, 
reference books, how-to books, etc., each call for a different approach 
to structuring the document, its table of contents and index, and, if 
there are graphical or tabular elements, to preserving and referencing 
them.
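
To make the splitting step concrete, here is a rough Python sketch of one 
possible approach, not a recommendation for every kind of book. It 
assumes each chapter begins with an <h1> heading, which may not be true 
of any particular converter's output. The function names and the 
chapterNN.html file naming are made up for illustration.

```python
import re
from html import escape, unescape

# Assumption: every chapter starts with an <h1> element. Real Word or
# Mobipocket output may mark chapters differently.
H1_OPEN = re.compile(r"<h1\b", re.IGNORECASE)
H1_FULL = re.compile(r"<h1\b[^>]*>(.*?)</h1>", re.IGNORECASE | re.DOTALL)
ANY_TAG = re.compile(r"<[^>]+>")


def split_chapters(body_html):
    """Split HTML body text into (title, html) pairs, one per <h1>."""
    starts = [m.start() for m in H1_OPEN.finditer(body_html)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # anything before the first heading
    ends = starts[1:] + [len(body_html)]

    chapters = []
    for start, end in zip(starts, ends):
        part = body_html[start:end]
        if not part.strip():
            continue  # skip whitespace-only pieces
        heading = H1_FULL.match(part)
        if heading:
            # Strip inline tags and decode entities to get a plain title.
            title = unescape(ANY_TAG.sub("", heading.group(1))).strip()
        else:
            title = "Front matter"
        chapters.append((title, part))
    return chapters


def build_toc(chapters):
    """Return an ordered list linking to chapter01.html, chapter02.html, ..."""
    items = [
        '<li><a href="chapter%02d.html">%s</a></li>' % (n, escape(title))
        for n, (title, _) in enumerate(chapters, start=1)
    ]
    return "<ol>\n" + "\n".join(items) + "\n</ol>"
```

Each chapter's HTML would then be wrapped in a page template and written 
to its own file, with the list from build_toc() placed on an index page. 
A reference book or textbook would probably need a finer split (for 
example on <h2> as well) and an index, which this sketch does not attempt.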

If you want to preserve formatting, you should convert to PDF, not to 
HTML, because browsers and ebook readers are designed to reflow text to 
fit the screen or window, which makes reading easier but discards fixed 
page layout. Most ebook creation software uses HTML as the intermediate 
step in creating an ebook, and you can generally capture that HTML file 
for other purposes.

I've used Mobipocket Creator to create ebooks in Mobi Reader format from 
PDF documents, but it also accepts Word and plain text input. The first 
step in the process converts the original document to an HTML file 
(XHTML 1.0 Strict). The ebook is then built from this file by adding 
metadata for navigation. The file is preserved in the ebook creation 
directory and may be rather large, but it can certainly be retrieved and 
edited in your favorite web editor.

Mobipocket eBook Creator can be downloaded for free from:

http://www.mobipocket.com/en/downloadSoft/ProductDetailsCreator.asp

Mobipocket is a French company that was bought by Amazon in 2005, and 
the Mobipocket software forms the basis of the Kindle ebook system.

Jan Werner
__________


Rich wrote:
>     ** Be sure to fill out the survey/skills inventory in the member's area.
>     ** If you did, we all thank you.
>
> Many people have responded to Claudia -- with the basically obvious answer
> MS Word can do it - WELL ...
>
> Not really - on 2 counts:
>
> 1) a 300-page book is not something that Word's 'Save as HTML' can handle
> properly - it will make an HTML document that exceeds the limits of any
> browser and clearly any reader -- so what is needed is something that
> handles paging and a TOC
>
> 2) as some have commented, the HTML is REALLY terrible - Word tries
> for exactly what the page looks like in Word - not something HTML is
> very good at -- so the code is a nightmare and very browser-display
> intensive.
>
> So does anyone have an idea of a practical solution?
>
> I am familiar with systems like DocBook (http://www.docbook.org/) that
> have many output modes, but I'm not sure if it can start with a Word
> doc - even in RTF form.
>
> Any other ideas?
>
> Rich
>
> PS  -- I hope those that were blocked appreciate that too many of the
> same answer has to be prevented
> and since you don't know who else answered - it's up to me (list
> moderator) to filter
>
> On 3/2/2010 11:22 AM, Jeffrey Peck wrote:
>>
>>
>> If you are using Word 97 or newer,  there should be an option to "Save
>> as HTML".  The following link provides some detail on what to expect
>> and some issues that may be encountered:
>> http://www.temple.edu/cs/web/wordconvert.html
>>
>> Perhaps some other Hidden-Tech users have some good/bad experience
>> with this feature?
>>
>> - Jeff
>>
>> On Mar 2, 2010, at 10:00 AM, Claudia Gere wrote:
>>
>>> I’m looking for the easiest/cleanest way to turn a Microsoft Word
>>> document (a 300-page book, text with simple formatting, no photos)
>>> into HTML code. Does anyone have experience using an application
>>> (free or fee) for this purpose?
>>> Thank you, Claudia
>>> Claudia Gere & Co. LLC
>>> Helping smart people become outstanding authors™
>>> Produce, Publish, Promote
>>> Follow me on Twitter: @claudiagere
>>> Aspiring Authors Workshops
>>> www.claudiagereco.com/Workshop.html
>>> <http://www.claudiagereco.com/Workshop.html>
>>> Claudia at ClaudiaGereCo.com <mailto:Claudia at ClaudiaGereCo.com>
>>> www.ClaudiaGereCo.com <http://www.claudiagereco.com/>
>>> +1 413 259 1741
>>> _______________________________________________
>>> Hidden-discuss mailing list - home page: http://www.hidden-tech.net
>>> Hidden-discuss at lists.hidden-tech.net
>>> <mailto:Hidden-discuss at lists.hidden-tech.net>
>>>
>>> You are receiving this because you are on the Hidden-Tech Discussion
>>> list.
>>> If you would like to change your list preferences, Go to the Members
>>> page on the Hidden Tech Web site.
>>> http://www.hidden-tech.net/members
>>
>
> --
> Rich Roth
> CEO On-the-net
>
> Bringing you complex online systems since the net was young
> http://www.tnrglobal.com  - Blog:http://www.rizbang.com
> Helping move the world:http://www.earththrives.com
>

